LNCS 2996 



Volker Diekert

Michel Habib (Eds.) 



STACS 2004

21st Annual Symposium 

on Theoretical Aspects of Computer Science 

Montpellier, France, March 2004, Proceedings 



Springer 




Lecture Notes in Computer Science 2996 

Edited by G. Goos, J. Hartmanis, and J. van Leeuwen




Springer 

Berlin 
Heidelberg 
New York 
Hong Kong 
London 
Milan 
Paris 
Tokyo 




Volker Diekert Michel Habib (Eds.) 



STACS 2004



21st Annual Symposium 
on Theoretical Aspects of Computer Science 
Montpellier, France, March 25-27, 2004 
Proceedings 




Springer 




Series Editors 



Gerhard Goos, Karlsruhe University, Germany 
Juris Hartmanis, Cornell University, NY, USA 
Jan van Leeuwen, Utrecht University, The Netherlands 

Volume Editors 

Volker Diekert 
Universität Stuttgart, FMI
Universitätsstr. 38, 70569 Stuttgart, Germany
E-mail: diekert@informatik.uni-stuttgart.de

Michel Habib 

LIRMM, Université Montpellier II

161, rue Ada, 34392 Montpellier, Cedex 5, France 

E-mail: habib@lirmm.fr 



Cataloging-in-Publication Data applied for 

A catalog record for this book is available from the Library of Congress. 

Bibliographic information published by Die Deutsche Bibliothek 

Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; 

detailed bibliographic data is available in the Internet at <http://dnb.ddb.de>. 



CR Subject Classification (1998): F, E.1, I.3.5, G.2
ISSN 0302-9743 

ISBN 3-540-21236-1 Springer-Verlag Berlin Heidelberg New York



This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, 
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, 
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law. 

Springer-Verlag is a part of Springer Science+Business Media

springeronline.com 

© Springer-Verlag Berlin Heidelberg 2004 
Printed in Germany 

Typesetting: Camera-ready by author, data conversion by PTP-Berlin, Protago-TeX-Production GmbH 
Printed on acid-free paper SPIN: 10991633 06/3142 5 4 3 2 1 0 




Preface 



The Symposium on Theoretical Aspects of Computer Science (STACS) is alternately
held in France and in Germany. The conference of March 25-27, 2004
at the Corum, Montpellier was the twenty-first in this series. Previous meetings
took place in Paris (1984), Saarbrücken (1985), Orsay (1986), Passau (1987),
Bordeaux (1988), Paderborn (1989), Rouen (1990), Hamburg (1991), Cachan
(1992), Würzburg (1993), Caen (1994), München (1995), Grenoble (1996), Lübeck
(1997), Paris (1998), Trier (1999), Lille (2000), Dresden (2001), Antibes (2002),
and Berlin (2003).

The symposium looks back at a remarkable tradition of over 20 years. The
interest in STACS has been increasing continuously during recent years and has
turned it into one of the most significant conferences in theoretical computer
science. The STACS 2004 call for papers led to more than 200 submissions from
all over the world.

The reviewing process was extremely hard: more than 800 reviews were done. 
We would like to thank the program committee and all external referees for the 
valuable work they put into the reviewing process of this conference. 

We had a two-day meeting for the program committee in Montpellier during 
November 21-22, 2003. Just 54 papers (i.e., 27% of the submissions) could be 
accepted, as we wanted to keep the conference in its standard format with only 
two parallel sessions. This strict selection guaranteed the very high scientific 
quality of the conference. 

We would like to thank the three invited speakers Claire Kenyon (École
Polytechnique, Palaiseau), Erich Grädel (RWTH, Aachen), and Robin Thomas
(Georgia Institute of Technology, Atlanta) for presenting their recent results at
STACS 2004.

Special thanks for the local organization go to Christophe Paul and Céline
Berger, who spent a lot of time and effort, and did most of the organizational
work, and thanks also go to the Ph.D. students of the graph algorithms project
for their support. We acknowledge the financial support STACS 2004 received
from the CNRS, the Montpellier Agglomeration, the Languedoc-Roussillon region,
the University of Montpellier, and the LIRMM.



Montpellier, January 2004 



Volker Diekert 
Michel Habib



Approximation Schemes for Metric Clustering Problems



Claire Kenyon 

LIX, École Polytechnique, France.



Problem statement and motivation. The problem of partitioning a data 
set into a small number of clusters of related items has a crucial role in many 
information retrieval and data analysis applications, such as web search and 
classification [5,6,22,11], or interpretation of experimental data in molecular 
biology [21].

We consider a set $V$ of $n$ points endowed with a distance function $\delta$. These
points have to be partitioned into a fixed number $k$ of subsets $C_1, C_2, \ldots, C_k$
so as to minimize the cost of the partition, which is defined to be the sum over
all clusters of the sum of pairwise distances in a cluster. We call this problem
$k$-Clustering. In the settings that we consider, this optimization problem is NP-hard
to solve exactly even for $k = 2$ (using arguments similar to those in [8,7]).
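The objective above can be made concrete with a small sketch. It assumes, purely for illustration, points on the real line with the absolute difference as the metric; the names `clustering_cost` and `line_metric` are hypothetical and do not come from the text.

```python
from itertools import combinations

def clustering_cost(clusters, dist):
    """Sum, over all clusters, of the pairwise distances inside each cluster."""
    return sum(dist(x, y)
               for cluster in clusters
               for x, y in combinations(cluster, 2))

def line_metric(x, y):
    """Illustrative metric: absolute difference on the real line."""
    return abs(x - y)

# Two partitions of the same five points into k = 2 clusters.
good = [[0, 1, 2], [10, 11]]   # groups nearby points together
bad = [[0, 1, 10], [2, 11]]    # mixes far-apart points

print(clustering_cost(good, line_metric))
print(clustering_cost(bad, line_metric))
```

The partition that keeps nearby points together has the smaller cost, which is what the minimization in $k$-Clustering rewards.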



The $k$-Clustering problem was proposed by Sahni and Gonzalez [19] in the
setting of arbitrary weighted graphs. Unfortunately, only poor approximation
guarantees are possible [16,12]. Guttman-Beck and Hassin [14] initiated the study
of the problem in metrics. Schulman [20] gave probabilistic algorithms for $\ell_2^2$
$k$-Clustering.

The results which we present. The results presented deal with the case
that $\delta$ is an arbitrary metric. We first present polynomial time algorithms which
for every fixed $\varepsilon > 0$ compute partitions into two parts which maximize the sum
of intercluster distances: this objective function is the complement of Metric
2-Clustering (polynomial time approximation scheme for Metric Max Cut [8]).
We next present a polynomial time algorithm for 2-Clustering [15] which uses
the Metric Max Cut approximation scheme. Finally, we present an extension
to algorithms for every fixed integer $k$ and for every fixed $\varepsilon > 0$ that compute
a partition into $k$ clusters of cost at most $1 + \varepsilon$ times the cost of an optimum
partition [9]. The running time is $O(f(k, \ldots))$. Note that Bartal, Charikar, and
Raz [4] gave a polynomial time approximation algorithm with polylogarithmic
performance guarantees for Metric $k$-Clustering where $k$ is arbitrary (i.e., part
of the input).
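The sense in which Max Cut is the complement of 2-Clustering can be written out as a short identity. This is a sketch using only the definitions above; the underbrace labels are added here for readability.

```latex
\[
\underbrace{\sum_{\{x,y\} \subseteq V} \delta(x,y)}_{\text{fixed by the input}}
\;=\;
\underbrace{\sum_{\{x,y\} \subseteq C_1} \delta(x,y)
          + \sum_{\{x,y\} \subseteq C_2} \delta(x,y)}_{\text{cost of the 2-clustering } (C_1, C_2)}
\;+\;
\underbrace{\sum_{x \in C_1,\; y \in C_2} \delta(x,y)}_{\text{value of the cut } (C_1, C_2)}
\]
```

Since the left-hand side does not depend on the partition, a partition minimizes the clustering cost exactly when it maximizes the cut. Approximation guarantees do not carry over automatically between the two objectives, which is consistent with the text treating the 2-Clustering algorithm as a separate result.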

It is interesting to note that both Schulman's algorithm for $k$-Clustering and
the algorithm of Fernandez de la Vega and Kenyon for Metric Max Cut use a
similar idea of sampling data points at random from a biased distribution that
depends on the pairwise distances. In recent research on clustering problems,
sampling has been the core idea in the design of provably good algorithms for
various objective functions. Examples include [2,1,17].



The Metric $k$-Clustering algorithm

1. By exhaustive search, guess the optimal cluster sizes $n_1 \ge n_2 \ge \cdots \ge n_k$.

2. By exhaustive search, for each pair of large cluster indices $i$ and $j$, guess whether
$C_i^*$ and $C_j^*$ are close or far.

3. Taking the equivalence relation which is the transitive closure of the relation
“$C_i^*$ and $C_j^*$ are close”, define a partition of large cluster indices into groups.

4. For each large cluster $C_i^*$, let $c_i$ be a random uniform element of $V$. Assign each
point $x \in V$ to the group $G$ which minimizes $\min_{i \in G} [n_i \delta(x, c_i)]$.

5. By exhaustive search, for each group $G$ thus constructed, guess $|G \cap S|$, where
$S = \bigcup_{i\ \mathrm{small}} C_i^*$ is the union of small clusters. For each $x$ assigned to group $G$,
let $f(x) = \min_{i \in G} \delta(x, c_i)$. Remove from $G$'s assignment the $|G \cap S|$ elements
with largest value $f(x)$.

6. Partition each group of large clusters into the appropriate number $h$ of clusters
using the PTAS for Max-$h$-Cut with error parameter $\varepsilon' = \varepsilon^2 \varepsilon^{3j_0}/(3k^3)$.

7. Recursively partition the removed elements into the appropriate number of clusters.



Positional Determinacy of Infinite Games*



Erich Grädel
Aachen University 



Abstract. We survey results on determinacy of games and on the exis- 
tence of positional winning strategies for parity games and Rabin games. 
We will then discuss some new developments concerning positional de- 
terminacy for path games and for Muller games with infinitely many 
priorities. 



1 Introduction 

1.1 Games and Strategies 

We study infinite two-player games with complete information, specified by an
arena (or game graph) and a winning condition. An arena $\mathcal{G} = (V, V_0, V_1, E, \Omega)$
consists of a directed graph $(V, E)$, equipped with a partitioning $V = V_0 \cup V_1$ of
the nodes into positions of Player 0 and positions of Player 1, and a function
$\Omega : V \to C$ that assigns to each position a priority (or colour) from a set $C$.
Although the set $V$ of positions may be infinite, it is usually assumed that $C$ is
finite. (We will drop this assumption in Section 4 where we discuss games with
infinitely many priorities.)

In case $(v, w) \in E$ we call $w$ a successor of $v$ and we denote the set of
all successors of $v$ by $vE$. To avoid tedious case distinctions, we assume that
every position has at least one successor. A play of $\mathcal{G}$ is an infinite path $v_0 v_1 \ldots$
formed by the two players starting from a given initial position $v_0$. Whenever the
current position $v_i$ belongs to $V_0$, then Player 0 chooses a successor $v_{i+1} \in v_i E$; if
$v_i \in V_1$, then $v_{i+1} \in v_i E$ is selected by Player 1. The winning condition describes
which of the infinite plays $v_0 v_1 \ldots$ are won by Player 0, in terms of the sequence
$\Omega(v_0)\Omega(v_1) \ldots$ of priorities appearing in the play. Thus, a winning condition is
given by a set $W \subseteq C^\omega$ of infinite sequences of priorities.

Winning conditions can be specified in several ways. In the theory of Gale- 
Stewart games as developed in descriptive set theory, the winning condition is 
just an abstract set $W \subseteq \{0, 1\}^\omega$. In computer science applications winning
conditions are often specified by formulae from a logic on infinite paths, such
as LTL (linear time temporal logic), FO (first-order logic), or MSO (monadic
second-order logic) over a vocabulary that uses the linear order $<$ and monadic
predicates $P_c$ for each priority $c \in C$. Of special importance are also Muller
conditions, where the winner of a play depends on the set of priorities that have 
been seen infinitely often. 

* This research has been partially supported by the European Community Research
Training Network “Games and Automata for Synthesis and Validation” (GAMES)

A (deterministic) strategy for Player $\sigma$ is a partial function $f : V^* V_\sigma \to V$
that maps initial segments $v_0 v_1 \ldots v_m$ of plays ending in a position $v_m \in V_\sigma$ to
a successor $f(v_0 \ldots v_m) \in v_m E$. A play $v_0 v_1 \ldots \in V^\omega$ is consistent with $f$, if
Player $\sigma$ always moves according to $f$, i.e., if $v_{m+1} = f(v_0 \ldots v_m)$ for every $m$
with $v_m \in V_\sigma$. We say that such a strategy $f$ is winning from position $v_0$, if
every play that starts at $v_0$ and that is consistent with $f$, is won by Player $\sigma$.
The winning region of Player $\sigma$, denoted $W_\sigma$, is the set of positions from which
Player $\sigma$ has a winning strategy.



1.2 Determinacy 

A game $\mathcal{G}$ is determined if $W_0 \cup W_1 = V$, i.e., if from each position one of
the two players has a winning strategy. On the basis of the axiom of choice 
it is not difficult to prove that there exist nondetermined games. The classical 
theory of infinite games in descriptive set theory links determinacy of games 
with topological properties of the winning conditions. Usually the format of 
Gale-Stewart games is used where the two players strictly alternate, and in each 
move a player selects an element of {0, 1}; thus the outcome of a play is an 
infinite string $\pi \in \{0, 1\}^\omega$. Gale-Stewart games can be viewed as graph games,
for instance on the infinite binary tree, or on a bipartite graph with four nodes. 
Zermelo [21] proved already in 1913 that if in each play of a game, the winner is 
determined already after a finite number of moves, then one of the two players 
has a winning strategy. In topological terms the winning sets in such a game are 
clopen (open and closed). Before we mention further results, let us briefly recall 
some basic topological notions. 

Topology. We consider the space $B^\omega$ of infinite sequences over a set $B$, endowed
with the topology whose basic open sets are $O(x) := x \cdot B^\omega$, for $x \in B^*$. A set
$L \subseteq B^\omega$ is open if it is a union of sets $O(x)$, i.e., if $L = W \cdot B^\omega$ for some $W \subseteq B^*$.
A tree $T \subseteq B^*$ is a set of finite words that is closed under prefixes. It is easily
seen that $L \subseteq B^\omega$ is closed (i.e., the complement of an open set) if $L$ is the set of
infinite branches of some tree $T \subseteq B^*$, denoted $L = [T]$. This topological space
is called Cantor space in case $B = \{0, 1\}$, and Baire space in case $B = \omega$.

The class of Borel sets is the closure of the open sets under countable union
and complementation. Borel sets form a natural hierarchy of classes $\Sigma^0_\eta$ for
$1 \le \eta < \omega_1$, whose first levels are

$\Sigma^0_1$ (or $G$): the open sets
$\Pi^0_1$ (or $F$): the closed sets
$\Sigma^0_2$ (or $F_\sigma$): countable unions of closed sets
$\Pi^0_2$ (or $G_\delta$): countable intersections of open sets

In general, $\Pi^0_\eta$ contains the complements of the $\Sigma^0_\eta$-sets, $\Sigma^0_{\eta+1}$ is the class of
countable unions of $\Pi^0_\eta$-sets, and $\Sigma^0_\lambda = \bigcup_{\eta<\lambda} \Sigma^0_\eta$ for limit ordinals $\lambda$.

In the 1950s, Gale and Stewart showed that all open games and all closed
games are determined. This was then extended in several papers to higher levels 
of the Borel hierarchy, until Martin [14] proved in 1975 that in fact all games 
with Borel winning conditions are determined. The theory of infinite games from 
there branched in several directions. In descriptive set theory one aims to prove 
determinacy results for stronger, non-Borel games. For game theory that relates
to computer science, determinacy is just a first step in the analysis of a game. 
Rather than in the mere existence of winning strategies, one is interested in 
reasonably simple winning strategies that can be effectively constructed, are 
computationally efficient and do not require too much memory. We will focus on 
this last aspect here. 

1.3 Positional Determinacy 

In general, winning strategies can be very complicated. It is of interest to deter- 
mine which games admit simple strategies, in particular finite memory strategies 
and positional strategies. While positional strategies only depend on the current 
position, not on the history of the play, finite memory strategies have access to 
a bounded amount of information on the past. Finite memory strategies can be
defined as strategies that are realisable by finite automata. 

More formally, a strategy with memory $M$ for Player $\sigma$ is given by a
triple $(m_0, U, F)$ with initial memory state $m_0 \in M$, a memory update function
$U : M \times V \to M$ and a next-move function $F : V_\sigma \times M \to V$. Initially,
the memory is in state $m_0$ and after the play has gone through the
sequence $v_0 v_1 \ldots v_m$ the memory state is $u(v_0 \ldots v_m)$, defined inductively by
$u(v_0 \ldots v_m v_{m+1}) = U(u(v_0 \ldots v_m), v_{m+1})$. In case $v_m \in V_\sigma$, the next move
from $v_0 \ldots v_m$, according to the strategy, leads to $F(v_m, u(v_0 \ldots v_m))$. In case
$M = \{m_0\}$, the strategy is positional; it can be described by a function
$F : V_\sigma \to V$.

We will say that a game is positionally determined if it is determined and 
both players have positional winning strategies on their winning regions. 

2 Muller Games, Streett-Rabin Games, and Parity 
Games 

Parity games are graph games with priority labeling $\Omega : V \to \{0, \ldots, d\}$ for
some $d \in \mathbb{N}$ and parity winning condition: Player 0 wins a play $\pi$ if the least
priority occurring infinitely often in $\pi$ is even. Parity games are of importance
for several reasons [19,20]. 

— Parity games are positionally determined. This has been first established by
Mostowski [16] and by Emerson and Jutla [5]. An immediate consequence
of positional determinacy is that winning regions of parity games can be
decided in NP $\cap$ Co-NP.

— Many complicated games can be reduced to parity games (over larger game
graphs).

— Parity games arise as the model checking games for fixed point logics. In
particular the model checking problem for the modal $\mu$-calculus can be solved
in polynomial time if, and only if, winning regions for parity games can be
decided in polynomial time.
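As an illustration of the parity winning condition stated above, the following sketch decides the winner of an ultimately periodic play, i.e. a finite prefix followed by a cycle repeated forever. The representation of a play as a `(prefix, cycle)` pair and the name `parity_winner` are assumptions made for this example, not notation from the text.

```python
def parity_winner(prefix, cycle, priority):
    """Winner (0 or 1) of the play prefix . cycle^omega under the parity condition.

    Only positions on the cycle occur infinitely often, so the prefix is irrelevant:
    Player 0 wins iff the least priority on the cycle is even.
    """
    if not cycle:
        raise ValueError("an infinite play needs a non-empty cycle")
    least = min(priority[v] for v in cycle)
    return 0 if least % 2 == 0 else 1

# Hypothetical priority labeling Omega on three positions.
priority = {"a": 1, "b": 2, "c": 0}

# Play c (a b)^omega: priorities 1 and 2 recur, the least is 1, so Player 1 wins.
print(parity_winner(["c"], ["a", "b"], priority))
# Play a (b c)^omega: priorities 2 and 0 recur, the least is 0, so Player 0 wins.
print(parity_winner(["a"], ["b", "c"], priority))
```

Restricting attention to such lasso-shaped plays is a simplification; on finite arenas they are exactly the plays produced when both players use finite memory strategies.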

Parity games are a special case of Muller games. 

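Definition 1 A Muller condition over a finite set $C$ of priorities is written in
the form $(\mathcal{F}_0, \mathcal{F}_1)$ where $\mathcal{F}_0 \subseteq \mathcal{P}(C)$ and $\mathcal{F}_1 = \mathcal{P}(C) - \mathcal{F}_0$. A play $\pi$ in a game
with Muller winning condition $(\mathcal{F}_0, \mathcal{F}_1)$ is won by Player $\sigma$ if, and only if, $\mathrm{Inf}(\pi)$,
the set of priorities occurring infinitely often in $\pi$, belongs to $\mathcal{F}_\sigma$.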

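The Zielonka tree for a Muller condition $(\mathcal{F}_0, \mathcal{F}_1)$ over $C$ is a tree $Z(\mathcal{F}_0, \mathcal{F}_1)$
whose nodes are labelled with pairs $(X, \sigma)$ such that $X \in \mathcal{F}_\sigma$. We define
$Z(\mathcal{F}_0, \mathcal{F}_1)$ inductively as follows. Let $C \in \mathcal{F}_\sigma$ and let $C_0, \ldots, C_{k-1}$ be the maximal
sets in $\{X \subseteq C : X \in \mathcal{F}_{1-\sigma}\}$. Then $Z(\mathcal{F}_0, \mathcal{F}_1)$ consists of a root, labeled
by $(C, \sigma)$, to which we attach as subtrees the Zielonka trees $Z(\mathcal{F}_0 \cap \mathcal{P}(C_i), \mathcal{F}_1 \cap \mathcal{P}(C_i))$,
for $i = 0, \ldots, k - 1$.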
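The inductive definition of the Zielonka tree translates into a short recursive procedure. The sketch below is illustrative only: it represents $\mathcal{F}_0$ as a Python set of frozensets, and it assumes that only non-empty subsets are considered (a play always has a non-empty set of recurring priorities). The function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(s):
    """All non-empty subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def zielonka_tree(C, F0):
    """Return the tree as a nested triple (label set, sigma, list of subtrees)."""
    C = frozenset(C)
    sigma = 0 if C in F0 else 1
    # Subsets of C that belong to the other player's family F_{1-sigma}.
    other = [X for X in nonempty_subsets(C) if (X in F0) != (sigma == 0)]
    # Keep only the maximal ones (assumption: the empty set is ignored).
    maximal = [X for X in other if not any(X < Y for Y in other)]
    return (C, sigma, [zielonka_tree(X, F0) for X in maximal])

def count_leaves(tree):
    children = tree[2]
    return 1 if not children else sum(count_leaves(c) for c in children)

def depth(tree):
    return 1 + max((depth(c) for c in tree[2]), default=0)

# Muller condition F0 = {{1,2,3}}: the root has three children {1,2}, {1,3}, {2,3}.
muller_tree = zielonka_tree({1, 2, 3}, {frozenset({1, 2, 3})})

# Parity condition on {0,1,2}: a set is in F0 iff its least priority is even.
parity_F0 = {X for X in nonempty_subsets({0, 1, 2}) if min(X) % 2 == 0}
parity_tree = zielonka_tree({0, 1, 2}, parity_F0)

print(count_leaves(muller_tree), count_leaves(parity_tree), depth(parity_tree))
```

The enumeration of all subsets is exponential in $|C|$ and is meant only to mirror the definition, not to be efficient.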

It has been proved by Gurevich and Harrington [8] that Muller games are 
determined via finite memory strategies. However, Muller games need not be 
positionally determined, not even for solitaire games (where only one player 
moves). Consider, for instance, the game with three positions 1, 2, 3, all belonging 
to Player 0, with possible moves (1, 2), (2, 1), (1, 3), (3, 1), and winning condition 
$\mathcal{F}_0 = \{\{1, 2, 3\}\}$ (i.e., all three positions must be seen infinitely often). Clearly
Player 0 can win this game, but not with a positional strategy. 
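The three-position example can be checked mechanically. The sketch below simulates a strategy with memory given as a triple of initial state, update function, and next-move function, as in Section 1.3, and computes the set of positions visited infinitely often by detecting the first repeated (position, memory) pair. The helper names and the particular two-state memory strategy are assumptions made for illustration.

```python
# Arena from the text: positions 1, 2, 3, all owned by Player 0.
MOVES = {1: (2, 3), 2: (1,), 3: (1,)}

def inf_set(start, m0, update, next_move):
    """Positions seen infinitely often when Player 0 follows (m0, update, next_move)."""
    v, memory = start, update(m0, start)
    first_seen = {}  # (position, memory state) -> index in the play
    play = []
    while (v, memory) not in first_seen:
        first_seen[(v, memory)] = len(play)
        play.append(v)
        w = next_move(v, memory)
        assert w in MOVES[v], "a strategy must follow an edge of the arena"
        v, memory = w, update(memory, w)
    # From the first repeated configuration on, the play is periodic.
    return set(play[first_seen[(v, memory)]:])

def positional(choice_at_1):
    """Positional strategy: a single memory state, fixed choice at position 1."""
    return 0, (lambda m, v: 0), (lambda v, m: choice_at_1 if v == 1 else 1)

def alternating():
    """Memory = the position to visit next from 1; it flips after each visit to 2 or 3."""
    update = lambda m, v: {2: 3, 3: 2}.get(v, m)
    return 2, update, (lambda v, m: m if v == 1 else 1)

positional_results = {c: inf_set(1, *positional(c)) for c in (2, 3)}
memory_result = inf_set(1, *alternating())
print(positional_results, memory_result)
```

Both positional strategies leave one position unvisited, while the two-state strategy visits all three positions infinitely often, in line with the claim that Player 0 wins but not positionally.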

Besides parity games there are other important special cases of Muller games. 
Of special relevance for us are games with Rabin and Streett conditions because
these are positionally determined for one player [11]. 

Definition 2 A Streett-Rabin condition is a Muller condition $(\mathcal{F}_0, \mathcal{F}_1)$ such that
$\mathcal{F}_0$ is closed under union.

In the Zielonka tree for a Streett-Rabin condition, the nodes labeled $(X, 1)$
have only one successor. We remark that in the literature, Streett and Rabin
conditions are often defined in a different manner, based on a collection $\{(E_i, F_i) :
i = 1, \ldots, k\}$ of pairs of sets. However, it is not difficult to see that the definitions
are equivalent [22]. Further, it is also easy to show that if both $\mathcal{F}_0$ and $\mathcal{F}_1$
are closed under union, then $(\mathcal{F}_0, \mathcal{F}_1)$ is equivalent to a parity condition. The
Zielonka tree for a parity condition is just a finite path.

In a Streett-Rabin game, Player 1 has a positional winning strategy on his
winning region. On the other hand, Player 0 can win, on his winning region,
via a finite memory strategy, and the size of the memory can be directly read
off from the Zielonka tree. We present an elementary proof of this result. The
exposition is inspired by [4]. In the proof we use the notion of an attractor.

Definition 3 Let $\mathcal{G} = (V, V_0, V_1, E, \Omega)$ be an arena and let $X, Y \subseteq V$, such that
$X$ induces a subarena of $\mathcal{G}$ (i.e., every position in $X$ has a successor in $X$). The
attractor of Player $\sigma$ of $Y$ in $X$ is the set $\mathrm{Attr}^X_\sigma(Y)$ of those positions $v \in X$
from which Player $\sigma$ has a strategy to force the play into $Y$. More formally
$\mathrm{Attr}^X_\sigma(Y) = \bigcup_\alpha Z^\alpha$ where

$Z^0 = X \cap Y$,

$Z^{\alpha+1} = Z^\alpha \cup \{v \in V_\sigma \cap X : vE \cap Z^\alpha \neq \emptyset\} \cup \{v \in V_{1-\sigma} \cap X : vE \subseteq Z^\alpha\}$,

$Z^\lambda = \bigcup_{\alpha<\lambda} Z^\alpha$ for limit ordinals $\lambda$.

On $\mathrm{Attr}^X_\sigma(Y)$, Player $\sigma$ has a positional attractor strategy to bring the play
into $Y$. Moreover $X \setminus \mathrm{Attr}^X_\sigma(Y)$ is again a subarena.
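On a finite arena the transfinite induction in Definition 3 stabilizes after finitely many rounds, so the attractor can be computed by a simple fixpoint iteration. The following sketch follows the three clauses of the definition literally; the dictionary-based encoding of the arena (`owner`, `succ`) and the function name are assumptions for this example.

```python
def attractor(X, Y, owner, succ, sigma):
    """Attr^X_sigma(Y) on a finite arena.

    owner[v] is the player (0 or 1) who moves at v; succ[v] lists the successors vE.
    """
    X = set(X)
    Z = X & set(Y)                      # Z^0 = X ∩ Y
    changed = True
    while changed:                      # iterate Z^{alpha+1} until nothing is added
        changed = False
        for v in X - Z:
            if owner[v] == sigma:
                # Player sigma needs one move into Z.
                attracted = any(w in Z for w in succ[v])
            else:
                # The opponent is attracted only if all successors lie in Z (vE ⊆ Z).
                attracted = all(w in Z for w in succ[v])
            if attracted:
                Z.add(v)
                changed = True
    return Z

# Hypothetical four-position arena.
owner = {0: 0, 1: 1, 2: 1, 3: 0}
succ = {0: [1], 1: [0, 3], 2: [0, 2], 3: [3]}

print(attractor(range(4), {3}, owner, succ, sigma=1))
print(attractor(range(4), {3}, owner, succ, sigma=0))
```

In this example Player 1 can force every play into position 3, whereas Player 0 can force it only from position 3 itself, because Player 1 can keep moving between positions 0 and 1.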

Theorem 4 Let $\mathcal{G} = (V, V_0, V_1, E, \Omega)$ be a game with Streett-Rabin winning condition
$(\mathcal{F}_0, \mathcal{F}_1)$. Then $\mathcal{G}$ is determined, i.e. $V = W_0 \cup W_1$, with a finite memory
winning strategy of Player 0 on $W_0$, and a positional winning strategy of Player 1
on $W_1$. The size of the memory required by the winning strategy for Player 0 is
bounded by the number of leaves of the Zielonka tree for $(\mathcal{F}_0, \mathcal{F}_1)$.

Proof. We proceed by induction on the number of priorities in $C$ or, equivalently,
the depth of the Zielonka tree $Z(\mathcal{F}_0, \mathcal{F}_1)$. Let $\ell$ be the number of leaves of $Z(\mathcal{F}_0, \mathcal{F}_1)$.
We distinguish two cases.

We first assume that $C \in \mathcal{F}_1$. Let

$X_0 := \{v : \text{Player 0 has a winning strategy with memory of size} \le \ell \text{ from } v\}$,

and $X_1 = V \setminus X_0$. It suffices to prove that Player 1 has a positional winning
strategy on $X_1$. To construct this strategy, we combine three positional strategies
of Player 1: a trap strategy, an attractor strategy, and a winning strategy on a
subgame with fewer priorities.

First, we observe that $X_1$ is a trap for Player 0; this means that Player 1 has
a positional trap strategy $t$ on $X_1$ to enforce that the play stays within $X_1$.

Since $\mathcal{F}_0$ is closed under union, there is a unique maximal subset $C' \subseteq C$
with $C' \in \mathcal{F}_0$. Let $Y := X_1 \cap \Omega^{-1}(C \setminus C')$ and let $Z = \mathrm{Attr}^{X_1}_1(Y) \setminus Y$. Observe
that Player 1 has a positional attractor strategy $a$, by which he forces from any
position $z \in Z$ that the play reaches $Y$.

Finally, let $V' = X_1 \setminus (Y \cup Z)$ and let $\mathcal{G}'$ be the subgame of $\mathcal{G}$ induced by
$V'$, with winning condition $(\mathcal{F}_0 \cap \mathcal{P}(C'), \mathcal{F}_1 \cap \mathcal{P}(C'))$. Since this game has fewer
priorities, the induction hypothesis applies, i.e. $V' = W_0' \cup W_1'$, Player 0 has a
winning strategy with memory $\le \ell$ on $W_0'$ and Player 1 has a positional winning
strategy $g'$ on $W_1'$. However, $W_0' = \emptyset$; otherwise we could combine the strategies
of Player 0 to obtain a winning strategy with memory $\le \ell$ on $X_0 \cup W_0' \supsetneq X_0$,
contradicting the definition of $X_0$. Hence $W_1' = V'$.

We can now define a positional strategy $g$ for Player 1 on $X_1$ by

$$g(x) = \begin{cases} g'(x) & \text{if } x \in V' \\ a(x) & \text{if } x \in Z \\ t(x) & \text{if } x \in Y \end{cases}$$

Consider any play $\pi$ that starts at a position $v \in X_1$ and is consistent with
$g$. Obviously $\pi$ stays within $X_1$. If it hits $Y \cup Z$ only finitely often, then from
some point onward, it stays within $V'$ and coincides with a play consistent with
$g'$. It is therefore won by Player 1. Otherwise $\pi$ hits $Y \cup Z$, and hence also $Y$,
infinitely often. Thus, $\mathrm{Inf}(\pi) \cap (C \setminus C') \neq \emptyset$ and therefore $\mathrm{Inf}(\pi) \in \mathcal{F}_1$.

We now consider the second case, $C \in \mathcal{F}_0$. There exist maximal subsets
$C_0, \ldots, C_{k-1} \subseteq C$ with $C_i \in \mathcal{F}_1$. Observe that for every set $D \subseteq C$, we have
that if $D \cap (C \setminus C_i) \neq \emptyset$ for all $i < k$, then $D \in \mathcal{F}_0$. Let

$X_1 := \{v : \text{Player 1 has a positional winning strategy from } v\}$,

and $X_0 = V \setminus X_1$. We claim that Player 0 has a finite memory winning strategy
of size $\le \ell$ on $X_0$. To construct this strategy, we proceed in a similar way as
above, for each of the sets $C \setminus C_i$. We will obtain strategies $f_0, \ldots, f_{k-1}$ for
Player 0, such that $f_i$ has finite memory $M_i$, and we will use these strategies to
build a winning strategy $f$ on $X_0$ with memory $M_0 \cup \cdots \cup M_{k-1}$.

For $i = 0, \ldots, k-1$, let $Y_i = X_0 \cap \Omega^{-1}(C \setminus C_i)$, let $Z_i = \mathrm{Attr}^{X_0}_0(Y_i) \setminus Y_i$, and let
$a_i$ be a positional attractor strategy, by which Player 0 can force a play from any
position in $Z_i$ to $Y_i$. Further, let $U_i = X_0 \setminus (Y_i \cup Z_i)$ and let $\mathcal{G}_i$ be the subgame of
$\mathcal{G}$ induced by $U_i$, with winning condition $(\mathcal{F}_0 \cap \mathcal{P}(C_i), \mathcal{F}_1 \cap \mathcal{P}(C_i))$. The winning
region of Player 1 in $\mathcal{G}_i$ is empty; indeed, if Player 1 could win $\mathcal{G}_i$ from $v$, then,
by induction hypothesis, he could win with a positional winning strategy. By
combining this strategy with the positional winning strategy of Player 1 on $X_1$,
this would imply that $v \in X_1$; but $v \in U_i \subseteq V \setminus X_1$.

Hence, by induction hypothesis, Player 0 has a winning strategy $f_i$ with finite
memory $M_i$ on $U_i$. Let $(f_i + a_i)$ be the combination of $f_i$ with the attractor
strategy $a_i$. From any position $v \in U_i \cup Z_i$ this strategy ensures that the play
either remains inside $U_i$ and is winning for Player 0, or it eventually reaches a
position in $Y_i$.

We now combine the finite-memory strategies $(f_0 + a_0), \ldots, (f_{k-1} + a_{k-1})$
to a winning strategy $f$ on $X_0$, which ensures that either the play ultimately
remains within one of the regions $U_i$ and coincides with a play according to $f_i$,
or that it cycles infinitely often through all the regions $Y_0, \ldots, Y_{k-1}$.

At positions in $\bigcap_{i<k} Y_i$, Player 0 just plays with a (positional) trap strategy
ensuring that the play remains in $X_0$. At the first position $v \notin \bigcap_{i<k} Y_i$, Player 0
takes the minimal $i$ such that $v \notin Y_i$, i.e. $v \in U_i \cup Z_i$, and uses the strategy
$(f_i + a_i)$ until a position $w \in Y_i$ is reached. At this point, Player 0 switches
from $i$ to $j = i + \ell \pmod{k}$ for the minimal $\ell$ such that $w \notin Y_j$. Hence $w \in U_j \cup Z_j$;
Player 0 now plays with strategy $(f_j + a_j)$ until a position in $Y_j$ is reached. There
Player 0 again switches to the appropriate next strategy, and so on.

Assuming that $M_i \cap M_j = \emptyset$ for $i \neq j$ it is not difficult to see that $f$ can be
implemented with memory $M = M_0 \cup \cdots \cup M_{k-1}$. We leave a formal definition
of $f$ to the reader.

It remains to prove that $f$ is winning on $X_0$. Let $\pi$ be a play that starts in $X_0$
and is consistent with $f$. If $\pi$ eventually remains inside some $U_i$ then it coincides,
from some point onwards, with a play that is consistent with $f_i$, and is therefore
won by Player 0. Otherwise it hits each of the sets $Y_0, \ldots, Y_{k-1}$ infinitely often.
But this means that $\mathrm{Inf}(\pi) \cap (C \setminus C_i) \neq \emptyset$ for all $i < k$; as observed above this
implies that $\mathrm{Inf}(\pi) \in \mathcal{F}_0$.

Note that, by induction hypothesis, the size of the memory $M_i$ is bounded
by the number of leaves of the Zielonka subtrees $Z(\mathcal{F}_0 \cap \mathcal{P}(C_i), \mathcal{F}_1 \cap \mathcal{P}(C_i))$.
Consequently the size of $M$ is bounded by the number of leaves of $Z(\mathcal{F}_0, \mathcal{F}_1)$.

□ 

Of course it also follows from this Theorem that parity games are positionally 
determined. 



3 Path Games 

Another interesting variant of two-player games on graphs are path games where 
in each move a player can select a path of arbitrary finite length rather than just 
an edge. Such games arise in several contexts. 

3.1 Banach-Mazur Games

In descriptive set theory, path games have been studied in the form of Banach-Mazur
games (see [9, Chapter 6] or [10, Chapter 8.H]). In their original variant
(see [15, pp. 113-117]), the winning condition is a set $W$ of real numbers; in
the first move, one of the players selects an interval $d_1$ on the real line, then
his opponent chooses an interval $d_2 \subseteq d_1$, then the first player selects a further
refinement $d_3 \subseteq d_2$ and so on. The first player wins if the intersection $\bigcap_n d_n$
of all intervals contains a point of $W$, otherwise his opponent wins.

By identifying real numbers with infinite sequences of natural numbers, this
game is essentially equivalent to a path game on the $\omega$-branching tree $T^\omega$ with
winning condition $W \subseteq \omega^\omega$. Player 0 starts by selecting a finite path $a_0 a_1 \ldots a_m$
from the root, and in each further move, the players extend the path by another
finite sequence of numbers. The outcome of the play is an infinite path $\pi$ through
$T^\omega$. Player 0 has won if $\pi \in W$, otherwise Player 1 has won. This game is usually
denoted $G^{**}(W)$.

A classical result due to Banach and Mazur characterises, in terms of topological
properties, the winning conditions $W$ such that one of the two players has
a winning strategy for the game $G^{**}(W)$. We recall that a set $A$ in a topological
space is nowhere dense if its closure does not contain a non-empty open set. A
set is meager if it is a union of countably many nowhere dense sets and it has
the Baire property if its symmetric difference with some open set is meager. In
particular, every Borel set has the Baire property.

Theorem 5 (Banach-Mazur) (1) Player 1 has a winning strategy for the
game $G^{**}(W)$ if, and only if, $W$ is meager.

(2) Player 0 has a winning strategy for $G^{**}(W)$ if, and only if, there exists a
finite word $x \in \omega^*$ such that $x \cdot \omega^\omega \setminus W$ is meager (i.e., $W$ is co-meager
in some basic open set).

As a consequence, it can be shown that for any class $\Gamma \subseteq \mathcal{P}(\omega^\omega)$ that is
closed under complement and under union with open sets, all games $G^{**}(W)$
with $W \in \Gamma$ are determined if, and only if, all sets in $\Gamma$ have the Baire property.
Since Borel sets have the Baire property, it follows that Banach-Mazur games
are determined for Borel winning conditions. (Via a coding argument, this can
also be easily derived from Martin's Theorem.)



3.2 Path Games in Computer Science 

Pistore and Vardi [17] use path games as a tool for task planning in nondeter- 
ministic domains. In their scenario, the desired infinite behaviour of a system is 
specified by formulae in LTL, and it is assumed that the outcome of actions may 
be nondeterministic; hence a plan has not only one possible execution path, but 
an execution tree. Between weak planning (some possible execution path satis- 
fies the specification) which is of course not very useful, and strong planning (all 
possible outcomes are consistent with the specification) which is often unrealis- 
tic, there is a spectrum of intermediate cases such as for instance strong cyclic 
planning: every possible partial execution of the plan can be extended to an exe- 
cution reaching the desired goal. In this context, planning can be modelled by a 
game between a friendly player E and a hostile player A selecting the outcomes 
of nondeterministic actions. This game is a path game on the execution tree of 
the plan, and the question is whether the friendly player E has a strategy to 
ensure that the outcome (an infinite path through the execution tree) satisfies 
the given LTL-specification. 

In [1] we have studied path games in a quite different scenario: once upon a
time in the west, two players set out on an infinite ride. More often than not,
they had quite different ideas on where to go, but for reasons that have by now
been forgotten they were forced to stay together - as long as they were both
alive. They agreed on the rule that each player can determine on every second
day, where the ride should go. Hence, one of the players began by choosing the
first day's ride: he indicated a finite, non-empty path $p_1$ from the starting point
$v$; on the second day his opponent selected the next stretch of way, extending
$p_1$ to a finite path $p_1 q_1$; then it was again the turn of the first player to extend
the path to $p_1 q_1 p_2$ and so on. After $\omega$ days, an infinite ride is completed and
it is time for payoff. The scenario is quite useful to capture the interest of the
audience at a conference, and provides good motivation to study general issues
like positional determinacy, algorithmic complexity and logical definability of
path games.

In the Banach-Mazur games the players strictly alternate. But it is
also interesting to study cases, where after a few alternations, one of the players
is eliminated, and the game then becomes a solitaire game. In the planning
application studied in [17] these are in fact the most relevant cases. For instance,
strong cyclic planning corresponds to what we call an $AE^\omega$-game: a single move
by A is followed by actions of E. In the scenario of the wild west as investigated
in [1] it is also quite realistic that one of the players may not make it to the end
(see e.g. [12,13]). To describe the alternation pattern between the players we call
the players Ego and Alter, and denote a move where Ego selects a finite path
by $E$ and an $\omega$-sequence of such moves by $E^\omega$; for Alter we use corresponding
notation $A$ and $A^\omega$. For any fixed triple $\mathcal{G} = (G, W, v)$ consisting of an arena
$G$, a winning condition $W$ and an initial position $v$, we then have the following
games.

- $(EA)^\omega(\mathcal{G})$ and $(AE)^\omega(\mathcal{G})$ are the path games with infinite alternation of
finite path moves.

- $(EA)^k E^\omega(\mathcal{G})$ and $A(EA)^k E^\omega(\mathcal{G})$, for arbitrary $k \in \mathbb{N}$, are the games ending
with an infinite path extension by Ego.

- $(AE)^k A^\omega(\mathcal{G})$ and $E(AE)^k A^\omega(\mathcal{G})$ are the games where Alter chooses the final
infinite lonesome ride.

This infinite collection of games collapses to a finite lattice of just eight 
different games, a result that has been proved independently in [17] and [1]. 

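Theorem 6 For every triple $\mathcal{G} = (G, W, v)$

$$
\begin{array}{ccccccc}
E^\omega(\mathcal{G}) & \succeq & EAE^\omega(\mathcal{G}) & \succeq & AE^\omega(\mathcal{G}) & & \\
 & & \rotatebox[origin=c]{-90}{$\succeq$} & & \rotatebox[origin=c]{-90}{$\succeq$} & & \\
 & & (EA)^\omega(\mathcal{G}) & \succeq & (AE)^\omega(\mathcal{G}) & & \\
 & & \rotatebox[origin=c]{-90}{$\succeq$} & & \rotatebox[origin=c]{-90}{$\succeq$} & & \\
 & & EA^\omega(\mathcal{G}) & \succeq & AEA^\omega(\mathcal{G}) & \succeq & A^\omega(\mathcal{G})
\end{array}
$$

Further, every path game on $\mathcal{G}$ is equivalent to one of these eight games.

For each comparison $\succeq$ in the diagram there are simple games for which it is
strict.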



3.3 Positional Determinacy of Path Games 

Path games with only finitely many alternations between the two players are 
trivially determined, for whatever winning condition, and for path games with 
infinite alternations, the Banach-Mazur Theorem establishes determinacy for a 
very large class of winning conditions, covering almost all path games that are 
likely to appear in computer science applications. 

As for the usual graph games, the question arises which path games admit
positional winning strategies. Observe that a positional strategy for a path game
on the graph $G = (V, F)$ has the form $f : V \to V^*$, assigning to every node $v$ a
finite path from $v$ through $G$.

We first look at path games with infinite alternations. 

Proposition 7 If Ego has a winning strategy for a path game $(EA)^\omega(G, W, v)$
with $W \in \Sigma_2$, then he also has a positional winning strategy.

Proof. Let $G = (V, F)$ be the game graph. Since $W$ is a countable union of closed
sets, we have $W = \bigcup_{n<\omega} [T_n]$ where each $T_n \subseteq V^*$ is a tree. Further, let $f$ be
any (non-positional) winning strategy for Ego. We claim that, in fact, Ego can
win with one move.

We construct this move by induction. Let $x_1$ be the initial path chosen by Ego
according to $f$. Let $i \geq 1$ and suppose that we have already constructed a finite
path $x_i \notin \bigcup_{n<i} T_n$. If $x_i y \in T_i$ for all finite $y$, then all infinite plays extending
$x_i$ remain in $W$, hence Ego wins with the initial move $w = x_i$. Otherwise choose
some $y_i$ such that $x_i y_i \notin T_i$, and suppose that Alter prolongs the play from $x_i$
to $x_i y_i$. Let $x_{i+1} := f(x_i y_i)$ be the result of the next move of Ego, according to his
winning strategy $f$.

If this process did not terminate, then it would produce an infinite play that
is consistent with $f$ and won by Alter. Since $f$ is a winning strategy for Ego,
this is impossible. Hence there exists some $m < \omega$ such that $x_m y \in T_m$ for all $y$.
Thus, if Ego moves to $x_m$ in his opening move, then he wins, no matter how the
play proceeds afterwards. In particular, Ego wins with a positional strategy. □

We cannot extend this observation beyond the $\Sigma_2$-level of the Borel hierarchy.
Consider the path game on the completely connected directed graph with
nodes 0 and 1, with a $\Pi_2$-winning condition for Ego, consisting of those infinite
plays that have infinitely many initial segments containing more ones than zeros.
Clearly, Ego has a winning strategy for $(EA)^\omega(G, W, v)$, but not a positional
one.
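A minimal Python sketch of this example, assuming plays are encoded as lists of 0/1 nodes (the names `ego_move` and `simulate` are illustrative and do not come from the source). Ego answers every move of Alter by appending enough ones to regain a majority, so the length of his move depends on the whole history and not only on the current node:

```python
def ego_move(history):
    """Return Ego's next finite path, given the play so far as a list of 0/1 nodes.

    Appends enough ones so that the resulting prefix has more ones than zeros.
    The number of ones depends on the entire history, which is the information
    that a positional strategy (a function of the last node only) does not have.
    """
    zeros = history.count(0)
    ones = history.count(1)
    needed = max(1, zeros - ones + 1)  # moves must be non-empty
    return [1] * needed


def simulate(alter_moves, start=0):
    """Play finitely many rounds of (EA)^omega against a fixed list of Alter moves."""
    history = [start]
    good_prefixes = 0
    for alter_path in alter_moves:
        history = history + ego_move(history)
        if history.count(1) > history.count(0):
            good_prefixes += 1  # one more initial segment with a majority of ones
        history = history + list(alter_path)
    return history, good_prefixes
```

After every move of Ego the current prefix has a majority of ones, so an infinite play produced this way has infinitely many such initial segments. Two histories ending in the same node can require moves of different length, which is what a positional strategy cannot express.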

Muller and S1S winning conditions. We recall that there are very simple
examples of Muller games that do not admit positional winning strategies. For
path games the situation is different.

Proposition 8 All path games $(EA)^\omega(\mathcal{G})$ with a Muller winning condition admit
positional winning strategies.

Proof. We will write $v \leq w$ to denote that position $w$ is reachable from $v$ in the
arena $G$. For every position $v$, let $C(v)$ be the set of priorities reachable from $v$,
that is, $C(v) := \{\Omega(w) : v \leq w\}$. Obviously, $C(w) \subseteq C(v)$ whenever $v \leq w$. We
call $v$ a stable position if $C(w) = C(v)$ for all $w$ that are reachable from $v$. Note
that from every $u$ some stable position is reachable. Further, if $v$ is stable, then
every reachable position $w \geq v$ is stable as well.

Let the set of winning plays in $\mathcal{G}$ be defined by a Muller condition $(\mathcal{F}_0, \mathcal{F}_1)$.
We claim that Ego has a winning strategy in $(EA)^\omega(\mathcal{G})$ iff there is a stable
position $v$ that is reachable from the initial position $v_0$, so that $C(v) \in \mathcal{F}_0$. To
see this, let us assume that there is such a stable position $v$ with $C(v) \in \mathcal{F}_0$.
Then, for every $u \geq v$, we choose a path $p$ from $u$ so that, when moving along $p$,
each colour of $C(u) = C(v)$ is visited at least once, and set $f(u) := p$. In case $v_0$
is not reachable from $v$, we let $f(v_0)$ be some path that leads from $v_0$ to $v$.
Now $f$ is a positional winning strategy for Ego in $(EA)^\omega(\mathcal{G})$, because, after the
first move, no colours other than those in $C(v)$ are seen. Moreover, every colour
in $C(v)$ is visited at each move of Ego, hence, infinitely often.

Conversely, if for every stable position $v$ reachable from $v_0$ we have $C(v) \in \mathcal{F}_1$,
then we can construct a winning strategy for Alter in a similar way. □

Note that in a finite arena all positions of a strongly connected component
that is terminal, i.e., with no outgoing edges, are stable. Thus, the above
characterisation translates as follows: Ego wins the game iff there is a terminal
component whose set of colours belongs to $\mathcal{F}_0$. Obviously this can be established
in linear time w.r.t. the size of the arena and the description of the winning
condition.

Corollary 9 On a finite arena $G$, path games with a Muller winning condition
$(\mathcal{F}_0, \mathcal{F}_1)$ can be solved in time $O(|G| \cdot |\mathcal{F}_0|)$.
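The stable-position criterion from the proof of Proposition 8 can be checked directly on small finite arenas. The sketch below is an illustration under stated assumptions, not the linear-time procedure behind the corollary: it recomputes reachability sets naively, the arena is a dict of successor lists, `colour` maps positions to priorities, and `f0` is a set of frozensets standing for $\mathcal{F}_0$. All of these names are hypothetical.

```python
from collections import deque


def reachable(graph, v):
    """All positions w with v <= w (v itself included)."""
    seen, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def colours(graph, colour, v):
    """C(v): the set of priorities reachable from v."""
    return frozenset(colour[w] for w in reachable(graph, v))


def ego_wins(graph, colour, f0, v0):
    """True iff some stable position v reachable from v0 has C(v) in F_0."""
    for v in reachable(graph, v0):
        c = colours(graph, colour, v)
        stable = all(colours(graph, colour, w) == c for w in reachable(graph, v))
        if stable and c in f0:
            return True
    return False
```

For instance, with `graph = {0: [1, 2], 1: [1], 2: [3], 3: [2]}` and four distinct colours, the stable positions are 1, 2 and 3, so only the colour sets of the two terminal components matter.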

In fact, this result can be extended to a very large class of path games, 
with general winning conditions definable in S1S. For path games with finite
alternations there are, however, also some cases where memory is required. The 
situation is summarized by the following result. 

Theorem 10 Let $\gamma(G, \varphi)$ denote the path game on the arena $G$ with alternation
pattern $\gamma$ and winning condition defined by the S1S-formula $\varphi$.

(1) All S1S path games $\gamma(G, \varphi)$ are determined via finite memory strategies.

(2) All S1S path games $(EA)^\omega(G, \varphi)$ and $(AE)^\omega(G, \varphi)$ are positionally
determined.

(3) For future conditions $\varphi \in$ S1S, all path games $\gamma(G, \varphi)$ are positionally
determined.

(4) For any game prefix $\gamma$ with finite alternations there exist games $\gamma(G, \varphi)$
with $\varphi \in$ S1S that do not admit positional winning strategies.

Here, a future condition is a formula that does not depend on initial segments:
for any $\omega$-word $\pi$ and any pair of finite words $x$ and $y$, we have $x\pi \models \varphi$ if,
and only if, $y\pi \models \varphi$. We just sketch the proof. First of all, it is not difficult to
prove that path games with parity winning condition are positionally determined
for any game prefix. (By Theorem 6 it suffices to consider the eight prefixes
$E^\omega$, $EAE^\omega$, $AE^\omega$, $EA^\omega$, $AEA^\omega$, $A^\omega$, $(EA)^\omega$, and $(AE)^\omega$.)

One can then use parity games as an instrument to investigate path games
with winning conditions specified by arbitrary S1S-formulae. It is well known
that every S1S-formula is equivalent to a deterministic parity automaton $\mathcal{A}$
(see e.g. [6]). To prove (1), we analyse a path game on $G$ with winning condition
$\varphi$ by considering two games on the product arena $G \times \mathcal{A}$: one, denoted $\mathcal{H}[G]$, with
the priority labelling inherited from $G$ and winning condition $\varphi$, the other one,
denoted $\mathcal{H}[\mathcal{A}]$, with priorities inherited from $\mathcal{A}$ and the parity winning condition.

- A play through $G \times \mathcal{A}$ is winning for Ego in $\mathcal{H}[G]$ if and only if it is winning
for Ego in $\mathcal{H}[\mathcal{A}]$.

- The two arenas $G$ and $\mathcal{H}[G]$ are bisimilar.

The positional determinacy of parity path games then implies positional determinacy
of $\mathcal{H}[G]$, which in turn implies finite memory determinacy for the
original game; the value $f(v, q)$ of a winning strategy depends on the current
position $v$ in $G$ and a state $q$ of the automaton $\mathcal{A}$.

To prove (2) we have to unify, for each position $v \in G$, the values $f(v, q)$ for
those pairs $(v, q)$ that are reachable in a play of $\mathcal{H}[G]$ according to $f$. Let us assume
that Ego wins the game $(EA)^\omega(G, \varphi)$ starting from position $v_0$. We will base
our argument on the associated game $\mathcal{H}[G]$, for which Ego has a positional
winning strategy $f$.

For any $v$, we denote by $Q_f(v)$ the set of states $q$ so that the position $(v, q)$
can be reached from position $(v_0, q_0)$ in a play according to $f$. Let $\{q_1, q_2, \ldots, q_n\}$
be an enumeration of $Q_f(v)$, in which the initial state $q_0$ is taken first, in case it
belongs to $Q_f(v)$. We construct a path associated to $v$ along the following steps.
First, set $p_1 := f(v, q_1)$; for $1 < i \leq n$, let $(v', q')$ be the node reached after
playing the path $p_1 \cdot p_2 \cdots p_{i-1}$ from position $(v, q_i)$ and set $p_i := f(v', q')$.
Finally, let $f'(v)$ be the concatenation of $p_1, p_2, \ldots, p_n$.

Fig. 1. Merging strategies at node $v$



Now, consider a play in $\mathcal{H}[G]$ in which Ego chooses the path $f'(v)$ at any
node $(v, q) \in V \times Q$. This way, the play will start with $f(v_0, q_0)$. Further, at
any position $(v, q)$ at which Ego moves, $f'(v)$ contains some segment of the form
$(v', q') \cdot f(v', q')$. In other words, every move of Ego has some “good part” which
would also have been produced by $f$ at the position $(v', q')$. But this means
that the play coincides with a play where Ego always moves according to the
strategy $f$ while all the “bad parts” were produced by Alter. This proves that
$f'$ is a positional winning strategy for Ego in the game $\mathcal{H}[G]$. Since the values do not
depend on the second component, $f'$ induces a positional winning strategy for Ego in
$(EA)^\omega(G, \varphi)$. The same construction works for the case $(AE)^\omega(G, \varphi)$, if we take
instead of $Q_f(v)$ the set $Q(v) := \{\delta(q_0, s) : s \text{ is a path from } v_0 \text{ to } v\}$.
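The merging step can be sketched in Python as follows. This is an illustration under assumptions that are not fixed by the text: paths are lists of the positions visited after the current one, the automaton is given as a dict `delta` that updates its state whenever a position is entered, and `states` is the chosen enumeration of the relevant automaton states at `v`. The name `merged_move` is hypothetical.

```python
def merged_move(v, states, f, delta):
    """Concatenate p_1, ..., p_n as in the construction of the merged strategy at v.

    v:      the current position of the arena
    states: enumeration q_1, ..., q_n of the automaton states to be merged at v
    f:      dict mapping (position, state) to a non-empty list of positions,
            the path that the product strategy plays from there
    delta:  dict mapping (state, position) to the state after entering that position
    """
    path = []
    for q in states:
        cur_v, cur_q = v, q
        # Replay p_1 ... p_{i-1} from (v, q_i) to find the node (v', q') reached.
        for u in path:
            cur_v, cur_q = u, delta[(cur_q, u)]
        # p_i := f(v', q')
        path = path + f[(cur_v, cur_q)]
    return path
```

Whatever automaton state the play is really in when Ego moves at `v`, the returned path contains a segment that the original strategy would have produced at the node reached at that point, which is the “good part” referred to above.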

The argument for (2) relies on the players always taking turns. If we consider
games where the players alternate only finitely many times, the situation
changes. Consider, for instance, the winning condition $\varphi \in$ S1S that requires
the number of zeroes occurring in a play to be odd, on the completely
connected arena with two positions 0, 1. When starting from position 1, Ego
obviously has winning strategies for each of the games $AE^\omega(G, \varphi)$
and $EAE^\omega(G, \varphi)$, but no positional ones. Nevertheless, such games are always
positionally determined for one of the players. Indeed, if a player wins a game
$\gamma(G, \varphi)$ finally controlled by his opponent, he always has a positional winning
strategy. This is trivial when $\gamma \in \{E^\omega, A^\omega, AE^\omega, EA^\omega\}$; for the remaining cases
$EAE^\omega$ and $AEA^\omega$ a positional strategy can be constructed as above. For a proof
of (3) the reader is referred to [1].

4 Games with Infinitely Many Priorities 

Muller games and parity games have been studied extensively both for the cases
of finite and for infinite game graphs. However, even in the case of infinite game
graphs, it has always been assumed that the positions are labelled with a finite
set of priorities, and this is essential for the proofs of positional determinacy
exhibited above, which proceed by induction on the number of priorities or on
the depth of the Zielonka tree.

We find it interesting to generalise the theory of infinite games to the case
of infinitely many priorities. Besides the theoretical interest, such games arise
in several contexts. For instance, pushdown games with winning conditions depending
on stack contents as considered in [2,3] can be viewed as special cases of
Muller or parity games with infinitely many priorities. We have started to study
such games and report here on some of the results. For more information the
reader is referred to [7].

The definition of Muller games (Definition 1) directly generalises to countable
sets $C$ of priorities¹. However, a representation of a Muller condition by a
Zielonka tree is not always possible, since we may have sets $D \in \mathcal{F}_\sigma$ that have
subsets in $\mathcal{F}_{1-\sigma}$ but no maximal ones. Further, it turns out that the condition
that $\mathcal{F}_0$ and $\mathcal{F}_1$ are both closed under finite unions is no longer sufficient for
positional determinacy. To see this let us discuss the possible generalisations of
parity games to the case of priority assignments $\Omega : V \to \omega$. For parity games
with finitely many priorities it is of course purely a matter of taste whether we
let the winner be determined by the least priority seen infinitely often or by the
greatest one. Here this is no longer the case.

Parity games are games where Player 0 wins the plays in which the least
priority seen infinitely often is even, or where no priority appears infinitely
often. Thus,

$$\mathcal{F}_0 = \{X \subseteq \omega : \min(X) \text{ is even}\} \cup \{\emptyset\}$$

$$\mathcal{F}_1 = \{X \subseteq \omega : \min(X) \text{ is odd}\}$$

¹ With minor modifications, it can also be generalised to uncountable sets $C$. See [7]
for a discussion of this.

Max-parity games are games where Player 0 wins if the maximal priority
occurring infinitely often is even, or does not exist, i.e.

$$\mathcal{F}_0 = \{X \subseteq \omega : \text{if } X \text{ is finite and non-empty, then } \max(X) \text{ is even}\}$$

$$\mathcal{F}_1 = \{X \subseteq \omega : X \text{ is finite, non-empty, and } \max(X) \text{ is odd}\}$$

Note that for both definitions, $\mathcal{F}_0$ and $\mathcal{F}_1$ are closed under finite unions.
Nevertheless the two conditions behave quite differently. The parity condition
has a very simple Zielonka tree, namely just a Zielonka path

$$\omega \to \omega \setminus \{0\} \to \omega \setminus \{0,1\} \to \omega \setminus \{0,1,2\} \to \cdots$$

whereas there is no Zielonka tree for the max-parity condition since $\omega \in \mathcal{F}_0$ has
no maximal subset in $\mathcal{F}_1$ (and $\mathcal{F}_1$ is not closed under unions of chains). This is in
fact related to a much more important difference concerning the memory needed
for winning strategies. Indeed, consider the max-parity game with positions $V_0 =
\{0\}$ and $V_1 = \{2n+1 : n \in \mathbb{N}\}$ (where the name of a position is also its priority),
such that Player 0 can move from 0 to any position $2n+1$ and Player 1 can
move back from $2n+1$ to 0. Clearly Player 0 has a winning strategy from each
position but no winning strategy with finite memory.
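The following Python sketch illustrates this example under a modelling assumption of ours: a finite-memory strategy of Player 0 is a finite machine with memory states, where `choose[m]` is the odd position picked at position 0 in memory state `m` and `update[m]` is the next memory state. Since Player 1 has only one move, the play is determined by this machine and is eventually periodic. The function names are hypothetical.

```python
def max_priority_seen_infinitely_often(choose, update, m0):
    """Largest priority visited infinitely often when Player 0 uses the given machine."""
    first_visit = {}
    visited = []
    m = m0
    while m not in first_visit:
        first_visit[m] = len(visited)
        visited.append(choose[m])
        m = update[m]
    # The memory states from the first repetition onwards recur forever.
    recurring = visited[first_visit[m]:]
    # Position 0 also recurs, but every odd position has a larger priority.
    return max(recurring)


def unbounded_choice(round_number):
    """An infinite-memory strategy: in round n move to position 2n + 1."""
    return 2 * round_number + 1
```

Any finite machine yields a finite, non-empty set of recurring odd positions, so the maximal priority seen infinitely often is odd and Player 0 loses. With `unbounded_choice` every odd position is visited only once, so the only priority seen infinitely often is 0, which is even.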

Hence positional determinacy, and even finite-memory determinacy, fails for
max-parity games with infinitely many priorities. On the other hand we prove in
[7] that parity games with priorities in $\omega$ do admit positional winning strategies
for both players. In fact, parity games over $\omega$ turn out to be the only Muller
games with this property.

Theorem 11 A Muller condition $(\mathcal{F}_0, \mathcal{F}_1)$ over a countable set $C$ of priorities
admits positional winning strategies if, and only if, it is isomorphic to a parity
condition over $n \leq \omega$ priorities.

This discrepancy between (min-)parity games and max-parity games has an
interesting application to a classical problem posed in [18]. The curious reader
is referred to [7].

Abstract. We initiate the theory of communication complexity of individual
inputs held by the agents, rather than worst-case or average-case.
We consider total, partial, and partially correct protocols, one-way versus
two-way, with (not in this version) and without help bits.



1 Introduction 

Suppose Alice has input $x$, Bob has input $y$, and they want to compute a function
$f(x, y)$ by communicating information and local computation according to
a fixed protocol $P = (P_A, P_B)$. Here $P_A$ is the protocol executed by Alice, and
$P_B$ is the protocol executed by Bob. To be more precise, let us assume that
Alice outputs $f(x, y)$. We are only interested in minimizing the number of bits
communicated between Alice and Bob, as is customary in the communication
complexity setting [8,3]. Usually, one considers the worst-case or average-case
over all inputs $x, y$ of given length $n$. However, in current situations like replicated
file systems and cache coherence algorithms in multiprocessor systems
and computer networks, the worst-case or average-case are not necessarily significant.
The files or updates can be very large; but in real life they may typically
be non-random and have considerable regularities or correlations that allow the
communicated information to be greatly compressed. Neither the worst-case
nor the average-case may be relevant; one wants to analyze the individual case.
This also gives much more information: from the individual case-analysis one
can easily derive the worst-case and the average-case, but not the other way
around. Indeed, certain phenomena have no counterpart in more traditional settings.
For example, there are inputs for Bob such that irrespective of Alice's
input, every “simple” total protocol requires arbitrarily higher communication
complexity than some more “complex” total protocol. Our results are expressed
in terms of Kolmogorov complexity [5], the minimal number of bits from which
the data can be decompressed by effective computation. We use the “plain”
Kolmogorov complexity, denoted as $C(x)$, $C(x|y)$, $C(x, y)$ for the absolute complexity
of $x$, the conditional complexity of $x$ given $y$, and the joint complexity
of $x, y$. Increased compression of the data approximates the Kolmogorov complexity
more and more, but the actual value is uncomputable in general. Given
$x, y$, and assuming that Alice and Bob have a protocol $P$ that works correctly
on $x, y$, we study the individual communication complexity $CC^P(x, y)$ defined
as the number of bits Alice with input $x$ and Bob with input $y$ exchange using
protocol $P$. We refer to a standard definition of communication protocol [3]. We
assume that the protocol identifies the length $n$ of the strings on which it works.
By the complexity of a protocol $P$ we mean its plain Kolmogorov complexity
conditional to $n$, denoted as $C(P|n)$.

Results and Related Work: We use the framework of communication
complexity as in [8,3]. As far as we are aware there is no previous work on
individual communication complexity. We formulate a theory of individual communication
complexity, and first analyze the “mother” problem, the identity
function, where Alice outputs the input of Bob. We look at special functions
such as the inner product, random functions, and the equality function. We
then turn to the question of analyzing the communication complexity, with respect
to the best protocol of given complexity, for the mother problem (identity
function). For total protocols that are always correct, the power of one-way
protocols equals that of two-way protocols, but for partially correct protocols
or partial protocols, two-way protocols are remarkably more powerful. We establish
a relation with Kolmogorov's Structure function, and the existence of
strange “non-communicable” inputs of possibly low Kolmogorov complexity for
total protocols, for which the communication complexity of every total protocol
is necessarily very large (almost the literal uncompressed input needs to be
communicated) unless all of the input is hard-wired in the protocol. It is shown
that for partial protocols two-way is more powerful than one-way when we use
help bits (omitted in this extended abstract for space reasons).

2 The Mother Function: Identity 

We start with listing some easy facts that establish lower and upper bounds
on individual communication complexity with respect to individual protocols
$P$, expressed in terms of $C(y|n)$, $C(y|P)$ and compared to $C(y|x)$. We assume
that the protocols do not depend on $x, y$, they are uniform, and they compute
the function concerned on strings of length $n$. Let $C$ be a constant such that
$C(y|n) \leq n + C$ for all $y$.

Let $I(x, y) = y$ be the identity function: Alice with input $x$ and Bob with
input $y$ compute output $y$ by Alice. This is the “mother” function: for if Alice
can compute $I$ then she can compute every computable function $f$.

(1) For all $n$ there is a protocol $P$ of complexity $n + O(1)$ to compute the
identity function such that for all $x, y$ of length $n$ we have $CC^P_I(x, y) \leq C(y|n)$.

Indeed, assume Bob knows $L_n = |\{p : |p| \leq n + C,\ U(p) \text{ halts}\}|$. ($U$ is the
reference universal Turing machine.) Then Bob can find all halting programs of
length at most $n + C$ by enumerating them until he obtains $L_n$ halting programs.
This allows him to find a shortest program $y^*$ for $y$. He transmits that program
to Alice and Alice computes $y$. The complexity of this protocol is $C(L_n) + O(1) =
n + O(1)$.

(2) The complexity bound $n + O(1)$ on $C(P|n)$ in item (1) is tight. For
every protocol of complexity less than $n$ the assertion of item (1) is false: for
all $P$ there are $x, y$ such that $CC^P_I(x, y) \geq n$ but $C(y|P) = O(1)$ (and hence
$C(y|n) \leq C(P|n) + O(1)$, that is, $C(y|n)$ is much smaller than $CC^P_I(x, y)$ if
$C(P|n)$ is much smaller than $n$).

Indeed, let $\epsilon$ be the empty string and let $y$ be the first string such that
$CC^P_I(\epsilon, y) \geq n$ (by counting arguments there is such $y$).

(3) For every protocol $P$ to compute the identity function and for every $x, y$ we
have $CC^P_I(x, y) \geq C(y|P) - O(1)$.

Let $c$ be the conversation between Alice and Bob on inputs $x, y$. It suffices to
prove that given $P, c$ we can find $y$. It is known [3] that the set of all pairs $(x', y')$
such that the conversation between Alice and Bob on input $(x', y')$ is equal to $c$
is a rectangle, that is, has the form $X \times Y$, for some $X, Y \subseteq \{0,1\}^n$. The set $Y$
is a one-element set, as for every $y' \in Y$ Alice outputs $y$ also on the input $(x, y')$
(the output of Alice depends on $c, P, x$ only). We can find $Y$ given $P, c$ and since
$Y = \{y\}$ we are done.

By item (2), for every protocol there are $x, y$ such that the right hand side of
the inequality $CC^P_I(x, y) \geq C(y|P) - O(1)$ is much less than its left hand side,
more specifically, $C(y|P) = O(1)$ and $CC^P_I(x, y) \geq n$.

(4) How is $CC^P_I(x, y)$ related to $C(y|x)$? By item (3) we have $CC^P_I(x, y) \geq
C(y|x) - C(P) - O(\log C(P))$ for all $x, y$. For all $P$ this inequality is not tight
for some $x, y$: there are $x, y$ such that $C(y|x) = O(1)$ but $CC^P_I(x, y) \geq n$.
Indeed, let $x = y$. We need to prove that for some $x$ it holds that $CC^P_I(x, x) \geq n$. For
every $x$ let $c(x)$ denote the conversation on the pair $(x, x)$. For every $x$ the set
of input pairs $(x', y')$ producing the conversation $c(x)$ is a rectangle of height 1,
as we have seen in item (3). Therefore the $c(x)$ are pairwise different for different $x$,
hence for some $x$ we have $|c(x)| \geq n$.

(5) However, for some $P, x, y$ the inequality $CC^P_I(x, y) \geq C(y|x) - C(P|n) -
O(\log C(P|n))$ is close to an equality: for all $\alpha$ there are $P, x, y$ such that
$CC^P_I(x, y) = C(y|x) - \alpha + O(1)$ and $C(P|n) \leq \alpha + O(1)$.

Indeed, let $x$ be some string. Let $y$ be a random string of length $n$, independent
of $x$, that is, $C(y|x) = n + O(1)$. Let $P$ be the following protocol: Bob first checks
whether his string $y'$ has the same prefix of length $\alpha$ as $y$. If this is the case
he sends 0 to Alice and then the $n - \alpha$ remaining bits of $y'$, and Alice prefixes the
$n - \alpha$ received bits by the first $\alpha$ bits of $y$ and outputs the resulting string. Otherwise
Bob sends 1 to Alice and then $y'$. The complexity of this protocol is at most
$\alpha + O(1)$, as both Alice and Bob need to know only the first $\alpha$ bits of $y$. And we
have $CC^P_I(x, y) = n - \alpha + 1 = C(y|x) - \alpha + O(1)$.
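The protocol of item (5) can be sketched in a few lines of Python. Bit strings are represented as Python strings over "0" and "1", and the names `make_protocol`, `bob` and `alice` are illustrative only; the single flag bit is the $O(1)$ overhead on top of the $n - \alpha$ payload bits.

```python
def make_protocol(y_prefix):
    """Build Bob's and Alice's procedures; only the first alpha bits of y are hard-wired."""
    alpha = len(y_prefix)

    def bob(y_prime):
        # Bob's message on his actual input y'.
        if y_prime.startswith(y_prefix):
            return "0" + y_prime[alpha:]   # flag 0, then the n - alpha remaining bits
        return "1" + y_prime               # flag 1, then y' literally

    def alice(message):
        # Alice reconstructs Bob's input from the message alone.
        flag, payload = message[0], message[1:]
        return y_prefix + payload if flag == "0" else payload

    return bob, alice
```

On the distinguished input the message has length $n - \alpha + 1$; on any input with a different prefix it has length $n + 1$, so the protocol is still total and always correct.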

3 Other Functions

Because Alice can compute every computable function once she knows Bob's
input, for all $f, P$ there is $P'$ with $CC^{P'}(x, y) \leq CC^P_I(x, y)$ and $C(P') \leq
C(P, f) + O(1)$.

The trivial lower bound on the individual communication complexity of a
function $f$ is $CC^P(x, y) \geq C(f(x, y) \mid x, P) - O(1)$ (and hence $TCC^\alpha_f(x, y) \geq
C(f(x, y) \mid x) - \alpha - O(\log \alpha)$, anticipating a notion defined later). In this
section we establish some nontrivial lower bounds on $CC^P(x, y)$ for $P$ computing
$f$ on all arguments, for the inner product function, the equality function and for
random Boolean functions. We omit the proofs for space reasons.

Initially, Alice has a string $x = x_1, \ldots, x_n$ and Bob has a string $y =
y_1, \ldots, y_n$ with $x, y \in \{0,1\}^n$. Alice and Bob compute the inner product of
$x$ and $y$ modulo 2

$$f(x, y) = \sum_{i=1}^{n} x_i \cdot y_i \bmod 2$$

with Alice ending up with the result. The following result is proven by extending
an argument introduced in [2].

Theorem 1. Every deterministic protocol $P$ computing the inner product function
$f$ requires at least $CC^P(x, y) \geq C(x, y \mid P) - n - O(1)$ bits of communication
on all $x, y$.



Remark 1. The result of the theorem is only significant for $C(x, y) > n$, but
for some $x, y$ it cannot be improved. Namely, if $x = 00 \ldots 0$ then $f(x, y) = 0$
for all $y$ and there is a protocol $P$ computing $f$ such that
$CC^P(x, y) = 0$ for all such $x, y$. If $y$ is any random string (relative to $P$) then the
right hand side of the inequality $CC^P(x, y) \geq C(x, y \mid P) - n - O(1)$ becomes
$O(1)$ while the left hand side is equal to 0, thus both sides are almost the same.
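For concreteness, the inner product function and the degenerate case used in the remark can be written out as follows; bit strings are Python strings over "0" and "1", and the function name is ours.

```python
def inner_product_mod2(x, y):
    """f(x, y) = sum_i x_i * y_i mod 2 for bit strings x, y of equal length."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(int(a) & int(b) for a, b in zip(x, y)) % 2
```

When `x` is the all-zero string the value is 0 for every `y`, so Alice can output the result without receiving any bits.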

Assume Alice has $x = x_1 \ldots x_n$ and Bob has $y = y_1 \ldots y_n$, and $f : \{0,1\}^{2n} \to
\{0,1\}$ satisfies

$$C(f \mid n) \geq 2^{2n} - n. \qquad (1)$$

The latter condition means that the truth table describing the outcomes of $f$ for
the $2^n$ possible inputs $x$ (the rows) and the $2^n$ possible inputs for $y$ (the columns)
has high Kolmogorov complexity. If we flip the truth table for a prospective $f$
using a fair coin, then with probability at least $1 - 2^{-n}$ it will satisfy (1).

Theorem 2. Every deterministic protocol $P$ computing a function $f$ satisfying
(1) (without help bits) requires at least $CC^P_f(x, y) \geq \min\{C(x \mid P), C(y \mid P)\} -
\log n - O(1)$.

Theorem 3. Let $f$ be the equality function, with $f(x, y) = 1$ if $x = y$ and 0
otherwise. For every deterministic protocol $P$ computing $f$ we have $CC^P(x, x) \geq
C(x \mid P) - O(1)$ for all $x$. On the other hand there is $P$ of complexity $O(1)$
such that there are $x, y$ ($x \neq y$) with $C(x \mid P), C(y \mid P) \geq n - 1$ for which
$CC^P_f(x, y) = 2$.

Generalizing this idea, every function that contains large monochromatic rectangles
has many pairs $x, y$ of complexity close to $n$ for
which the individual communication complexity drops to $O(\log n)$: In round 1
Bob tells Alice in which large rectangle (if any) his input is situated, by sending
the index of the rectangle to Alice, and 0 otherwise. If Bob did send an index,
and Alice's input is in that rectangle as well, then Alice outputs the color (“0”
or “1”) of the rectangle. Otherwise, Alice starts a default protocol.
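A small Python sketch of this rectangle protocol, under assumptions that the text leaves open: rectangles are given as triples `(X, Y, colour)` of two sets of strings and a bit, Bob picks the first rectangle containing his input, the index is sent with a fixed width, and the default protocol simply has Bob send `y` literally. The function name and the bit accounting are ours.

```python
from math import ceil, log2


def rectangle_protocol(x, y, f, rectangles):
    """Return (Alice's output, number of bits sent by Bob)."""
    # Index 0 means "no rectangle", so len(rectangles) + 1 values must be encoded.
    width = max(1, ceil(log2(len(rectangles) + 1)))
    index = 0
    for i, (_, ys, _) in enumerate(rectangles, start=1):
        if y in ys:
            index = i
            break
    bits = width
    if index:
        xs, _, colour = rectangles[index - 1]
        if x in xs:
            return colour, bits      # Alice answers from the index alone
    # Default protocol (an assumption): Bob sends y literally.
    return f(x, y), bits + len(y)
```

With polynomially many rectangles the index costs $O(\log n)$ bits, which is the communication on every pair that falls into one of them.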

4 Total Protocols 

Let $f$ be a function defined on pairs of strings of the same length. Assume that
Alice has $x$, Bob has $y$ and Alice wants to compute $f(x, y)$. As the complexity
measure we consider the number of bits communicated between Alice and Bob.
The naive definition of the individual communication complexity of the value of
the function $f$ on the argument $(x, y)$ is the number of communicated bits in the
“best” communication protocol. Then, for every $x, y$ there is a protocol with no
communication at all on $(x, y)$: the string $y$ is hard wired into the protocol. To
meaningfully capture the individual communication complexity of computing a
function $f(x, y)$ we now define the following notion.

Definition 1. Let $\alpha$ be a natural number parameter. Let $TCC^\alpha_f(x, y)$ stand
for the minimum $CC^P(x, y)$ over all total protocols $P$ of complexity at most $\alpha$
that always compute $f$ correctly (being total, such a protocol terminates for all
inputs, and not only for $(x, y)$).

For $\alpha = n + O(1)$ we have $TCC^\alpha_f(x, y) = 0$ for all computable $f$ and all $x, y$,
since we can hard wire $y$ into the protocol. Therefore it is natural to consider
only $\alpha$ that are much smaller than $n$, say $\alpha = O(\log n)$. Since computation
of the Identity function suffices to compute all other (recursive) functions we
have $TCC^{\alpha+O(1)}_f(x, y) \leq TCC^\alpha_I(x, y)$. The trivial lower bound is $TCC^\alpha_f(x, y) \geq
C(f(x, y) \mid x) - \alpha - O(\log \alpha)$. For $f = I$ this gives $TCC^\alpha_I(x, y) \geq C(y \mid x) - \alpha -
O(\log \alpha)$.



4.1 One-Way Equals Two-Way for Identity 

Let $TCC^\alpha_{f,1\text{-way}}(x, y)$ stand for the minimum $CC^P(x, y)$ over all one-way (Bob
sends a message to Alice) total protocols $P$ of complexity at most $\alpha$ computing $f$
(over all inputs, and not only on $(x, y)$). It is clear that $TCC^\alpha_{f,1\text{-way}}(x, y)$ does not
depend on $x$: indeed, consider for given $(x, y)$ the best protocol $P$; that protocol
sends the same message on every other pair $(x', y)$, hence $TCC^\alpha_{f,1\text{-way}}(x', y) \leq
TCC^\alpha_{f,1\text{-way}}(x, y)$. Obviously,

$$TCC^\alpha_f(x, y) \leq TCC^\alpha_{f,1\text{-way}}(x, y)$$

for all $\alpha, x, y, f$.

Surprisingly, for $f = I$, the Identity function, this inequality is an equality.
That is, for total protocols “1-way” is as powerful as “2-way.” More specifically,
the following holds.

Theorem 4. There is a constant $C$ such that for all $\alpha, x, y$ we have
$TCC^{\alpha+C}_{I,1\text{-way}}(x, y) \leq TCC^\alpha_I(x, y)$.

Proof. Pick a two-way protocol $P$ witnessing $TCC^\alpha_I(x, y) = l$. Let $c = c(x, y)$
be the conversation according to $P$ between Alice and Bob on inputs $x, y$. It is
known that the set of all pairs $(x', y')$ such that the conversation between Alice
and Bob on input $(x', y')$ is equal to $c$ is a rectangle, that is, has the form $X \times Y$,
for some $X, Y \subseteq \{0,1\}^n$. The set $Y$ is a one-element set, as for every $y' \in Y$
Alice outputs $y$ also on the input $(x, y')$ (the output of Alice depends on $c, P, x$
only).

Consider the following 1-way protocol $P'$: Bob finds an $x'$ with minimum $|c(x', y)|$
and sends $c(x', y)$ to Alice. Alice then finds the set of all pairs $(x'', y')$ such that
the conversation between Alice and Bob on input $(x'', y')$ is equal to $c(x', y)$. As
we have seen, that set has the form $X \times \{y\}$ for some $X$. Thus Alice knows $y$.
As $|c(x', y)| \leq |c(x, y)| = TCC^\alpha_I(x, y)$ and $C(P'|P) = O(1)$ we are done.
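The conversion in this proof can be tried out by brute force on short strings. In the Python sketch below, `toy_conversation` is a made-up two-way protocol for the Identity function, chosen by us purely for illustration, and all function names are hypothetical; the conversion itself follows the proof: Bob simulates Alice on every possible $x'$ and sends the shortest conversation, and Alice decodes it by searching for the rectangle that produces it.

```python
from itertools import product


def strings(n):
    """All bit strings of length n."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def toy_conversation(x, y):
    """A toy two-way protocol for Identity (an assumption, for illustration only).

    Alice first sends x[0]; Bob answers with y, padded with two extra bits
    when x[0] == '1'.
    """
    return x[0] + (y if x[0] == "0" else y + "00")


def bob_one_way(y, n, conversation):
    """Bob simulates Alice on every x' and sends the shortest conversation c(x', y)."""
    return min((conversation(xp, y) for xp in strings(n)), key=len)


def alice_decode(c, n, conversation):
    """Alice recovers y as the unique second component of the rectangle of c."""
    ys = {yp for xp in strings(n) for yp in strings(n) if conversation(xp, yp) == c}
    if len(ys) != 1:
        raise ValueError("c is not the conversation of a protocol computing Identity")
    (y,) = ys
    return y
```

The one-way message is never longer than the two-way conversation on the actual pair $(x, y)$, and the search costs only computation, not communication.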

4.2 Non-communicable Objects 

The function $TCC^\alpha_{I,1\text{-way}}(x, y)$, as a function of $y, \alpha$, essentially coincides with
the Kolmogorov structure function $h_y(\alpha)$ studied in [7]. The latter is defined by

$$h_y(\alpha) = \min\{\log |S| : S \ni y,\ C(S) \leq \alpha\},$$

where $S$ is a finite set and $C(S)$ is the length (number of bits) of the shortest
binary program from which the reference universal machine $U$ computes a listing
of the elements of $S$ and then halts. More specifically we have

$$TCC^{\alpha+O(1)}_{I,1\text{-way}}(x, y) \leq h_y(\alpha), \qquad (2)$$

$$h_y(\alpha + O(\log n)) \leq TCC^\alpha_{I,1\text{-way}}(x, y).$$

To prove the first inequality we have to transform a finite set $S \ni y$ into
a one-way protocol $P$ of complexity at most $\alpha = C(S) + O(1)$ witnessing
$TCC^\alpha_{I,1\text{-way}}(x, y) \leq \log |S|$. The protocol just sends the index of $y$ in $S$, or $y$
literally if $y \notin S$.
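A sketch of the protocol obtained from a finite set $S$, in Python. The explicit flag bit that separates the two cases is our assumption (the text does not say how the cases are distinguished); it adds one bit to the $\lceil \log |S| \rceil$ index bits. The names `make_set_protocol`, `bob` and `alice` are illustrative.

```python
from math import ceil, log2


def make_set_protocol(s):
    """One-way protocol for Identity derived from a finite set s of bit strings."""
    members = sorted(s)
    member_set = frozenset(members)
    width = ceil(log2(len(members))) if len(members) > 1 else 0

    def bob(y):
        if y in member_set:
            index_bits = format(members.index(y), "b").zfill(width) if width else ""
            return "0" + index_bits      # flag 0, then the index of y in S
        return "1" + y                   # flag 1, then y literally

    def alice(message):
        if message[0] == "0":
            index = int(message[1:], 2) if width else 0
            return members[index]
        return message[1:]

    return bob, alice
```

For `S = {"0000", "0101", "1010", "1111"}` an element of `S` costs 3 bits instead of the 5 bits needed for a string outside `S`.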

To prove the second inequality we have to transform a one-way total protocol
$P$ into a finite set $S \ni y$ of complexity at most $C(P) + O(\log n)$ with $\log |S| \leq
CC^P(y)$. The set consists of all $y'$ on which $P$ sends a message of the same
length $l$ as the length of the message on $y$. Obviously, $|S| \leq 2^l = 2^{CC^P(y)}$ and
to specify $S$ we need a program describing $P$ and $l$. Thus $C(S) \leq C(P) +
O(\log TCC^\alpha_{I,1\text{-way}}(x, y)) \leq C(P) + O(\log n)$.

For the properties of $h_y(\alpha)$, which by Theorem 4 are also properties of
$TCC^\alpha_I(x, y)$, its relation with the Kolmogorov complexity $C(y)$ of $y$ and possible
shapes of the function $\alpha \mapsto h_y(\alpha)$ we refer to [7].

We will present here only a few properties. First, two easy inequalities: For
all $\alpha \geq O(1)$ and all $x, y$ we have

$$C(y|n) - \alpha - O(\log \alpha) \leq TCC^\alpha_{I,1\text{-way}}(x, y) \leq n - \alpha + O(1). \qquad (3)$$

The first inequality is a direct consequence of the inequality $C(y|n) \leq
CC^P(x, y) + C(P|n) + O(\log C(P|n))$, which is trivial. To prove the second one
consider the protocol that sends $n - \alpha + C$ bits of $y$ (for an appropriate constant
$C$) and the remaining $\alpha - C$ bits are hardwired into the protocol. Its complexity is
at most $\alpha - C + O(1) \leq \alpha$ for an appropriate choice of $C$.

The second property is not so easy. Given y, consider values of a such that 

TCC,“i._y(x, y) + a = C{y) + 0(1). (4) 

That is, the protocol P witnessing (4) together with the one-way communication 
record Bob sends to Alice form a two-part code for y that is — up to an inde- 
pendent additive constant — as concise as the shortest one-part code for y (that 
has length C{y) by definition). Following the usage in [7] we call P a “sufficient” 
protocol for y. The description of the protocol plus the communication precisely 
describe y, and in fact, it can be shown that the converse holds as well (up to a 
constant additive term). There always exists such a protocol, since the protocol
that contains y hardwired in the form of a shortest program of length C(y)
satisfies the equality with $a = C(y) + O(1)$ and $TCC^{a}_{I,1\text{-way}}(x,y) = 0$. By definition we
cannot have $TCC^{a}_{I,1\text{-way}}(x,y) + a < C(y) - O(1)$, but for a sufficiently small we
have $TCC^{a}_{I,1\text{-way}}(x,y) + a > C(y) + O(1)$. In fact, for every form of function satisfying
the obvious constraints on $TCC_{I,1\text{-way}}$ there is a y such that $TCC^{a}_{I,1\text{-way}}(x,y)$ realizes
that function up to logarithmic precision. This shows that there are essentially
non-communicable strings. More precisely:

Theorem 5. For every $k < n$ and for every function h(a) on the integer domain
[0,k] with $h(0) = n$, $h(k) = 0$, $C(h) = O(\log n)$ such that $h(a) + a$ does not
increase, there is a string y of length n with $C(y) = k + O(\log n)$ such that

$$h(a + O(\log n)) \le TCC^{a}_{I,1\text{-way}}(x,y).$$

The proof is by combining Theorem 1 of [7] with (2). In particular, for every
$k < n - O(\log n)$ there are strings y of length n and complexity k such
that $TCC^{a}_{I,1\text{-way}}(x,y) \ge n - a - O(\log n)$ for all $a < k - O(\log n)$, while
$TCC^{a}_{I,1\text{-way}}(x,y) = O(1)$ for $a \ge k + O(1)$. We call such strings y
non-communicable. For example, with $k = (\log n)^2$ this shows that there are y
of complexity $C(y) \approx (\log n)^2$ with $TCC^{a}_{I,1\text{-way}}(x,y) \approx n - (\log n)^2$ for all
$a < C(y) - O(\log n)$ and O(1) otherwise. That is, Bob can hold a highly
compressible string y, but cannot use that fact to reduce the communication
complexity significantly below |y|! Unless all information about y is hardwired in
the (total) protocol the communication between Bob and Alice requires sending 
y almost completely literally. For such y, irrespective of x, the communication 
complexity is exponential in the complexity of y for all protocols of complexity 
less than that of y; when the complexity of the protocol is allowed to pass the 
complexity of y then the communication complexity suddenly drops to 0. 

Corollary 1. For every n, k with $k < n$ there are y of length n with
$C(y) = k + O(\log n)$ such that for every x, $TCC^{a}_{I,1\text{-way}}(x,y) \ge n - a - O(\log n)$ for
$a < C(y) - O(\log n)$; while for every x, y we have $TCC^{a}_{I,1\text{-way}}(x,y) = O(1)$ for
$a \ge C(y) + O(1)$.

This follows by combining Theorems 4, 5. If we relax the requirement of total 
and correct protocols to partial and partially correct protocols then we obtain 
the significantly weaker statements of Theorem 6 and Corollary 2. 

5 Partially Correct and Partial Protocols 

The individual communication complexity can decrease if we do not require
the communication protocol to be correct on all the input pairs. Let $CC^{a}_{f}(x,y)$
stand for the minimum $CC^{P}(x,y)$ over all P of complexity at most a computing
f correctly on input (x,y) (on other inputs P may output an incorrect result).
The minimum of the empty set is defined as $\infty$. Let $CC^{a}_{f,1\text{-way}}(x,y)$ stand for
the minimum $CC^{P}(x,y)$ over all one-way (Bob sends a message to Alice) P of
complexity at most a computing f(x,y) (again, on other inputs P may work
incorrectly). For instance, if f is a Boolean function then $CC^{O(1)}_{f}(x,y) = 0$
for all x, y (either the protocol always outputting 0 or the protocol always
outputting 1 computes f(x,y) for the specific pair (x,y)).



5.1 Partially Correct and Partial Protocols versus Total Ones 

It is easy to see that, in computing the Identity function, total partially
correct protocols are more powerful than totally correct ones for some (x,y). A total
partially correct protocol P computes f(x,y) correctly for some (x,y), but may
err on some inputs (u,v), in which case we set $CC^{P}(u,v) = \infty$. Being total, such
a protocol terminates for all inputs.

Definition 2. Let a be a natural number parameter. Let $CC^{a}_{f}(x,y)$ stand for
the minimum $CC^{P}(x,y)$ over all total partially correct protocols P of complexity
at most a.

For instance, for every n there is a total protocol $P = P_n$ computable from
n such that $CC^{P}(x,x) = 0$ (Alice outputs her string), thus $CC^{O(1)}_{I}(x,x) = 0$.
On the other hand, for random x of length n we have
$TCC^{a}_{I,1\text{-way}}(x,x) \ge TCC^{a}_{I}(x,x) \ge C(x|n) - a - O(\log a) \ge n - a - O(\log a)$.

We also consider partial protocols that on some x, y are allowed to get stuck,
that is, to give no instructions at all about how to proceed. Formally, such a protocol
is a pair of programs $(P_A, P_B)$. The program $P_A$ tells Alice what to do for each
c (the current part of the conversation) and x: either to wait for the next bit from
Bob, or to send a specific bit to Bob, or to output a certain string and halt.
Similarly, the program $P_B$ tells Bob what to do for each c and y: either to wait
for the next bit from Alice, or to send a bit to Alice, or to halt. This pair must satisfy
the following requirement for all $(x,y) \in \{0,1\}^{n}$ and all c: if one party gets the
instruction to send a bit, then the other party gets the instruction to wait for a bit.
However, we do not require that for all $(x,y) \in \{0,1\}^{n}$ and all c both parties get
some instruction; it is allowed that $P_A$, $P_B$ start some endless computation. In
particular, Alice may wait for a bit while at the same time Bob has no instruction
at all.

Definition 3. The complexity of a partial protocol $P = (P_A, P_B)$ is defined as
C(P|n). We say that P computes f on the input (x,y) if Alice outputs f(x,y)
when $P_A$, $P_B$ are run on (x,y). On other pairs Alice is allowed to output a
wrong answer or not to output anything at all. If protocol P does not terminate, or
gives an incorrect answer, for input (x,y), then $CC^{P}(x,y) = \infty$. Two-way and
one-way individual communication complexities with the complexity of the partial
protocol upper bounded by a are denoted by $PCC^{a}_{f}(x,y)$ and $PCC^{a}_{f,1\text{-way}}(x,y)$,
respectively.

Since the total, partially correct, protocols are a subset of the partial protocols,
we always have $PCC^{a}_{f}(x,y) \le CC^{a}_{f}(x,y) \le TCC^{a}_{f}(x,y)$. Consider again the
Identity function. We have the following obvious lower bound

$$C(y|x) - a - O(\log a) \le PCC^{a}_{I}(x,y) \qquad (5)$$

for all a, x, y. On the other hand we have the following upper bound if a is at
least $\log C(y) + O(1)$:

$$PCC^{a}_{I,1\text{-way}}(x,y) \le C(y). \qquad (6)$$

Indeed, we hardwire the value C(y) in the protocol using $\log C(y)$ bits. This
enables $P_B$ to find a shortest description $y^{*}$ of y and to send it to Alice;
subsequently $P_A$ decompresses the message received from Bob. Note that the program
$P_B$ gives no instruction to Bob if the complexity of Bob's input is greater than
C(y). Therefore, this protocol is not total. Comparing Equation (6) to
Equation (3) we see that for PCC we have a better upper bound than for TCC. It
turns out that for some pairs (x,y) the communication complexity for totally
correct (and even for partially correct) protocols is close to the upper bound
$n - a$ while the communication complexity for partial protocols is close to the
lower bound $C(y|x) - a \ll n$.

Theorem 6. For all a, n, x there are y of length n such that $CC^{a}_{I}(x,y) \ge n - a$
and $C(y|x) \le a + O(1)$.

Proof. Fix a string x. By a counting argument, there is a string y with
$CC^{a}_{I}(x,y) \ge n - a$. Indeed, there are fewer than $2^{a+1}$ total protocols of complexity
at most a. For each total protocol P there are at most $2^{n-a-1}$ different y's with
$CC^{P}(x,y) < n - a$. Therefore the total number of y's with $CC^{a}_{I}(x,y) < n - a$
is less than $2^{a+1} \cdot 2^{n-a-1} = 2^{n}$.

Let y be the first string with $CC^{a}_{I}(x,y) \ge n - a$. To identify y conditional on
x we only need to know the number of total protocols of complexity at most a:
given that number we enumerate all such protocols until we have found all of them. Given
all those protocols and x we run all of them on all pairs (x,y) to find $CC^{a}_{I}(x,y)$
(here we use that the protocols are total) for every y, and determine the first y
for which it is at least $n - a$. Hence $C(y|x) \le a + O(1)$.



Corollary 2. Fix constants $C_1$, $C_2$ such that $PCC^{\log n + C_1}_{I,1\text{-way}}(x,y) \le C(y) \le n + C_2$.
Applying the theorem to the empty string $\epsilon$ and to (say) $a = 2\log n$ we
obtain a y of length n with an exponential gap between $CC^{2\log n}_{I}(\epsilon,y) \ge n - 2\log n - O(1)$
and $PCC^{\log n + C_1}_{I,1\text{-way}}(\epsilon,y) \le C(y) \le 2\log n + O(1)$.

Using a deep result of An. Muchnik [6] we can prove that $PCC^{a}_{I,1\text{-way}}(x,y)$ is close
to C(y|x) for $a \ge O(\log n)$.

Theorem 7 (An. Muchnik). For all x, y of length n there is p such that
$|p| \le C(y|x) + O(\log n)$, $C(p|y) = O(\log n)$ and $C(y|p,x) = O(\log n)$, where the
constants in $O(\log n)$ do not depend on n, x, y.



Corollary 3. For all x, y of length n we have
$PCC^{O(\log n)}_{I,1\text{-way}}(x,y) \le C(y|x) + O(\log n)$.

Proof. Let p be the program of Muchnik's theorem, let q be the program of
length $O(\log n)$ for the reference computer to reconstruct p from y, and let r be the
program of length $O(\log n)$ for the reference computer to reconstruct y from the
pair (x,p). The protocol is as follows: Bob finds p from y, q and sends p to Alice;
Alice reconstructs y from x, p, r. Both q and r are hardwired into the protocol, so
its complexity is $O(\log n)$. This protocol is partial, as both Bob and Alice may
get stuck when reconstructing p from y', q and y from x', p, r on other inputs (x', y').

For very small values of C(y|x), C(y) we can do even better using the coloring
lemma 3.9 and theorem 3.11 from [1].

Lemma 1. Let $k_1$, $k_2$ be such that $C(x) \le k_1$ and $C(y|x) \le k_2$, and let
$m = |\{(x,y) : C(x) \le k_1, C(y|x) \le k_2\}|$. For $M = 2^{k_1}$, $N = 2^{k_2}$ and every
$1 \le B \le N$ Bob can compute the recursive function $R(k_1,k_2,m,y) = r_y \le (N/B)\,e\,(MN)^{1/B}$
such that Alice can reconstruct y from x, $r_y$, m and at most
$b \le \log B$ extra bits.

Using $k_1$, $k_2$, m, y, Bob can compute $r_y$ and send it in $\log r_y$ bits to Alice. The
latter computes y from x, m, $r_y$ using additionally $b \le \log B$ special bits provided
by the protocol. Then the number of bits that need to be communicated, in one
round, from Bob to Alice, is

$$\log r_y \le k_2 - \log B + \frac{k_1 + k_2}{B} + O(1).$$

The protocol $P = (P_A, P_B)$ uses $\le 2(k_1 + k_2) + b + O(1)$ bits.

Corollary 4. If $C(x), C(y|x) = O(\log\log n)$ and $b = O(\log\log n)$ then
$PCC^{O(\log\log n)}_{I,1\text{-way}}(x,y) \le C(y|x) - O(\log\log n)$.

5.2 Two-Way Is Better than One-Way for Partially Correct 
Protocols 

Note that for the Identity function all our upper bounds hold for one-way
protocols and all our lower bounds hold for two-way protocols. The following question
arises: are two-way protocols more powerful than one-way ones (to compute the
Identity function)? Theorem 4 implies that for total protocols it does not matter
whether the communication is one-way or two-way. For partially correct total
protocols and partial protocols the situation is different. It turns out that
partially correct total two-way protocols are stronger than even partial one-way
protocols.

Theorem 8. For every k, l, s such that $k \ge s + l2^{s}$ there are strings x, y of
length $(2^{s}+1)k$ such that $CC^{O(1)}_{I}(x,y) \le 2^{s}\log(2k)$ but $PCC^{s}_{I,1\text{-way}}(x,y) \ge l$.

Proof. We let $x = z_0 z_1 \ldots z_{2^s}$ where $z_0, \ldots, z_{2^s}$ have length k and $y = z_j 00\ldots0$
for some j.

To prove the upper bound consider the following two-way protocol: Alice
finds a set of indexes $I = \{i_1, \ldots, i_{2^s}\}$ such that for every distinct j, m there is
$i \in I$ such that the i-th bit of $z_j$ is different from the i-th bit of $z_m$ (such a set does exist,
which may be shown by induction). Then she sends to Bob the string $i_1 \ldots i_{2^s}$
and Bob sends to Alice the i-th bit of y for all $i \in I$. Alice now knows y.
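The two-way protocol from the upper-bound argument can be sketched as follows. This is an illustrative toy, not the authors' construction verbatim: the function names are hypothetical, the index set is chosen greedily (each added position splits at least one group of still-confused blocks, so at most one fewer positions than blocks are needed), and the bit count assumes each index is sent in ceil(log k) bits.

```python
def distinguishing_indices(blocks):
    """Greedily pick bit positions whose values separate all distinct blocks."""
    chosen = []
    while True:
        seen, clash = {}, None
        for z in blocks:
            sig = tuple(z[i] for i in chosen)
            if sig in seen and seen[sig] != z:
                clash = (seen[sig], z)
                break
            seen[sig] = z
        if clash is None:
            return chosen
        u, v = clash
        # u != v, so some position separates them; adding it splits their group.
        chosen.append(next(i for i in range(len(u)) if u[i] != v[i]))


def two_way_identity(blocks, j):
    """Alice holds all blocks; Bob holds blocks[j] padded with zeros."""
    k = len(blocks[0])
    y = blocks[j] + "0" * (k * (len(blocks) - 1))
    idx = distinguishing_indices(blocks)        # Alice -> Bob: chosen positions
    answers = "".join(y[i] for i in idx)        # Bob -> Alice: those bits of y
    (match,) = [z for z in blocks if "".join(z[i] for i in idx) == answers]
    bits_sent = len(idx) * (k - 1).bit_length() + len(idx)
    return match + "0" * (len(y) - k), y, bits_sent


blocks = ["0110", "0101", "1100", "0111", "0000"]   # 2^s + 1 blocks with s = 2, k = 4
```

Alice learns y after seeing only one bit per chosen index, which is where the bound of roughly 2^s log(2k) communicated bits comes from.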

We now need to find particular $z_0, z_1, \ldots, z_{2^s}$ such that no one-way protocol
is effective on the pair (x,y) obtained from them in the specified way. To this
end let $P_1, \ldots, P_N$ be all the one-way partial protocols of complexity less than
s computing the identity function. For every z and $i \le N$ let c(z,i) denote the
message sent by Bob in protocol $P_i$ when his input is $z00\ldots0$, provided the
length of the message is less than l. Otherwise let $c(z,i) = \infty$. Let c(z) stand
for the concatenation of c(z,i) over all i. The range of c(z) has $(2^{l})^{N} < 2^{l2^{s}}$
elements. Hence there is c such that for more than $2^{k - l2^{s}} \ge 2^{s}$ different z's we
have c(z) = c. Pick such c and pick different $z_0, z_1, \ldots, z_{2^s}$ among those z's.
Let $y_j$ stand for the string obtained from $z_j$ by appending 0s. We claim that
there is some j such that $CC^{P_i}(x,y_j) \ge l$ for all $i \le N$. Assume that this is not the case. That
is, for every j there is i such that $CC^{P_i}(x,y_j) < l$. There are $j_1 \ne j_2$ for which
i is the same. As $c(z_{j_1},i) = c(z_{j_2},i) \ne \infty$, Alice receives the same message in $P_i$
on inputs $(x,y_{j_1})$, $(x,y_{j_2})$ and should output both answers $y_{j_1}$, $y_{j_2}$, which is a
contradiction.

Corollary 5. Let in the above theorem $s = (\log k)/3$ and $l = k^{2/3}/\log k$. These
values satisfy the condition $k \ge s + l2^{s}$ and hence there are x, y of length
about $k^{4/3}$ with an almost quadratic gap between $CC^{O(1)}_{I}(x,y) \le k^{1/3}\log 2k$ and
$PCC^{(\log k)/3}_{I,1\text{-way}}(x,y) \ge k^{2/3}/\log k$. Letting $s = \log\log k$ and $l = k/(2\log k)$ we
obtain x, y of length about $k\log k$ with an exponential gap between
$CC^{O(1)}_{I}(x,y) \le \log k\,\log(2k)$ and $PCC^{\log\log k}_{I,1\text{-way}}(x,y) \ge k/(2\log k)$.

6 Summary of Some Selected Results for Comparison 



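• $\forall x,y,a\,[TCC^{a}_{f}(x,y) \ge CC^{a}_{f}(x,y) \ge PCC^{a}_{f}(x,y)]$ by definition.

• $\forall a,x,y\,[TCC^{a}_{I,1\text{-way}}(x,y) = TCC^{a}_{I}(x,y) + O(1)]$ Theorem 4 and discussion.

• $\forall n,k,a\;\exists y,|y|=n,C(y)=k\;\forall x\,[a < C(y) - O(\log n) \Rightarrow TCC^{a}_{I}(x,y) \ge n - a]$ Corollary 1.

• $\forall x,y,a\,[a \ge C(y) + O(1) \Rightarrow TCC^{a}_{I}(x,y) = O(1)]$ Corollary 1.

• $\forall n,x,a\;\exists y,|y|=n\,[C(y|x) \le a + O(1) \wedge CC^{a}_{I}(x,y) \ge n - a]$ Theorem 6.

• $\forall x,y,a\,[PCC^{a}_{I}(x,y) \ge C(y|x) - a - O(\log a)]$ (5).

• $\forall x,y\,[PCC^{\log C(y)+O(1)}_{I,1\text{-way}}(x,y) \le C(y)]$ (6).

• $\forall n\;\forall x,y,|x|=|y|=n\,[PCC^{O(\log n)}_{I,1\text{-way}}(x,y) \le C(y|x) + O(\log n)]$ Corollary 3.

• $\forall k,l,s : k \ge s + l2^{s}\;\exists x,y,|x|=|y|=(2^{s}+1)k\,[CC^{O(1)}_{I}(x,y) \le 2^{s}\log(2k) \wedge PCC^{s}_{I,1\text{-way}}(x,y) \ge l]$
Theorem 8.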

The situation gets different when we allow help bits. For space reasons we defer 
our results to the full version of this paper.

Abstract. We study the computational complexity of problems defined
by formulas over fixed finite lattices. We consider evaluation, satisfiability,
tautology, counting, and quantified formulas. It turns out that
evaluation and tautology always can be decided in alternating logarithmic
time. For satisfiability we obtain the following dichotomy result: If
the lattice is distributive, satisfiability is in alternating logarithmic time.
Otherwise, it is NP-complete. Counting is #P-complete for every lattice
with at least two elements. For quantified formulas over non-distributive
lattices we obtain PSPACE-completeness, while the problem is in alternating
logarithmic time if the lattice is distributive.



1 Introduction 

Goldmann and Russell [6] investigated the computational complexity of
determining if an equation over some fixed finite group has a solution. Formally,
an equation over a group $\mathcal{G} = (G, \circ)$, or more generally over an algebra
$\mathcal{A} = (A, f_1, \ldots, f_m)$, is given as

$$H(x_1, \ldots, x_n) = a$$

where H is a formula over the variables $x_1, \ldots, x_n$, the constants $c \in A$, and the
functions $f_1, \ldots, f_m$, and $a \in A$ is the target. The satisfiability problem over $\mathcal{A}$
is to determine if there is an assignment to the variables such that the equation
is satisfied. Goldmann and Russell [6] showed that this problem lies in P for
nilpotent groups, but is NP-complete for any non-solvable group. Barrington et
al. [2] extended these results by considering the computational complexity of
satisfiability over a fixed finite monoid.

Satisfiability over the boolean algebra $\mathcal{B} = (\{0,1\},\{\wedge,\vee,\neg\})$ is the classical
NP-complete problem [4], while satisfiability over the monotone boolean algebra
$\mathcal{M} = (\{0,1\},\{\wedge,\vee\})$, considering only monotone boolean formulas, can be
decided in alternating logarithmic time.

Formulas over lattices generalize monotone formulas. A lattice is a partially
ordered set with the functions $\wedge$ (greatest lower bound) and $\vee$ (smallest upper
bound). While the monotone boolean algebra $\mathcal{M}$ is a distributive lattice,
lattices can be non-distributive as well. We study the computational complexity of
problems defined by formulas over both distributive and non-distributive finite
lattices. We observe the following dichotomy for a finite lattice $\mathcal{V}$:

$\mathcal{V}$ is distributive $\Rightarrow$ Satisfiability over $\mathcal{V}$ is in ALOGTIME.

$\mathcal{V}$ is non-distributive $\Rightarrow$ Satisfiability over $\mathcal{V}$ is NP-complete.

The paper is organized as follows. In Section 2 we define the problems to study 
over arbitrary algebras. We recall results that lead to monotone formulas and 
finally to finite lattices. Section 3 introduces lattices and classifies the tautology 
and counting problems for lattices. Section 4 deals with distributive lattices. 
We show that most problems are easy in this case. In Section 5 we complete 
the dichotomy, proving several completeness results for non-distributive lattices. 
The last section gives an overview of all results, and mentions open problems. 

We use standard notations [3,7,5]. Most proofs are given in sketch. For more 
detail we refer to the Technical Report [9] . 

2 Algebras 

We study problems that are defined by formulas over algebras. An algebra is a
pair of sets (A, M) where A is a set of elements and $M \subseteq \{f : A^{n} \to A \mid n \in \mathbb{N}\}$ is a
set of total functions. The algebra is called finite if A and M are finite.

Definition 1. For an algebra $\mathcal{A} = (A, M)$ the set of all $\mathcal{A}$-formulas $F(\mathcal{A})$ is
given by

(1) For each $c \in A$: c is an $\mathcal{A}$-formula (constant).

(2) For each $i \ge 1$: $x_i$ is an $\mathcal{A}$-formula (variable).

(3) If $F_1, \ldots, F_n$ are $\mathcal{A}$-formulas, and $f \in M$ is a function $f : A^{n} \to A$,
then $f(F_1, \ldots, F_n)$ is an $\mathcal{A}$-formula.

For an $\mathcal{A}$-formula $F \in F(\mathcal{A})$, $n_F$ denotes the number of different indices of
variables occurring in F. Let $i_1 < \ldots < i_{n_F}$ be all indices of variables in F.
For arbitrary $w = (a_1, \ldots, a_{n_F}) \in A^{n_F}$, F(w) denotes the value of F when
replacing each $x_{i_j}$ by $a_j$ and applying the functions occurring in F. We also write
$F(x_1, \ldots, x_n)$ to make clear that F is an $\mathcal{A}$-formula that contains at most the
variables $x_1, \ldots, x_n$.
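A minimal sketch of Definition 1 may help to fix the conventions. The encoding of formulas as nested tuples and the names `variable_indices` and `evaluate` are assumptions made for this illustration only; the point is that an assignment w lists values for the variables in increasing order of their indices, so its length is n_F and not the largest index.

```python
def variable_indices(F):
    """Set of indices of the variables occurring in formula F."""
    kind = F[0]
    if kind == "const":
        return set()
    if kind == "var":
        return {F[1]}
    # ("app", name, [subformulas])
    return set().union(*(variable_indices(G) for G in F[2]))


def evaluate(F, w, functions):
    """Value F(w), where w assigns values to the variables in index order."""
    order = sorted(variable_indices(F))
    if len(w) != len(order):
        raise ValueError("w must have exactly n_F entries")
    env = dict(zip(order, w))

    def go(G):
        if G[0] == "const":
            return G[1]
        if G[0] == "var":
            return env[G[1]]
        return functions[G[1]](*(go(H) for H in G[2]))

    return go(F)


# Monotone boolean algebra ({0,1}, {and, or}) as an example algebra.
MONOTONE = {"and": min, "or": max}
# F = (x1 or x3) and x3, so n_F = 2 and w = (value of x1, value of x3).
F = ("app", "and", [("app", "or", [("var", 1), ("var", 3)]), ("var", 3)])
```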

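Definition 2. For a fixed algebra $\mathcal{A} = (A, M)$ and $a \in A$ we define the following
evaluation, tautology, satisfiability and counting problems:

$VAL^{a}_{\mathcal{A}} =_{df} \{(F,w) \mid F \in F(\mathcal{A}), w \in A^{n_F}, F(w) = a\}$.

$VAL_{\mathcal{A}} =_{df} \{(F,w,b) \mid (F,w) \in VAL^{b}_{\mathcal{A}}\}$.

$TAUT^{a}_{\mathcal{A}} =_{df} \{F \in F(\mathcal{A}) \mid \forall w \in A^{n_F} : F(w) = a\}$.

$TAUT_{\mathcal{A}} =_{df} \{(F,b) \mid F \in TAUT^{b}_{\mathcal{A}}\}$.

$SAT^{a}_{\mathcal{A}} =_{df} \{F \in F(\mathcal{A}) \mid \exists w \in A^{n_F} : F(w) = a\}$.

$SAT_{\mathcal{A}} =_{df} \{(F,b) \mid F \in SAT^{b}_{\mathcal{A}}\}$.

$\#^{a}_{\mathcal{A}} : F(\mathcal{A}) \to \mathbb{N} : F \mapsto \#\{w \in A^{n_F} \mid F(w) = a\}$.

$\#_{\mathcal{A}} : F(\mathcal{A}) \times A \to \mathbb{N} : (F,b) \mapsto \#^{b}_{\mathcal{A}}(F)$.

If we have a sequence of quantifiers $Q_1, \ldots, Q_n \in \{\forall, \exists\}$, then we define the
number of alternations $k(Q_1 \cdots Q_n) =_{df} \#\{i \in \mathbb{N} \mid 1 \le i \le n-1, Q_i \ne Q_{i+1}\}$.
Now define for $k \ge 1$ the following problems of quantified formulas:

$QAF^{a}_{\mathcal{A}} =_{df} \{Q_1 \ldots Q_n F \mid F \in F(\mathcal{A}), n = n_F, Q_1, \ldots, Q_n \in \{\forall, \exists\}, Q_1 a_1 \in A \cdots Q_n a_n \in A : F(a_1, \ldots, a_n) = a\}$.

$QAF_{\mathcal{A}} =_{df} \{(qF,b) \mid qF \in QAF^{b}_{\mathcal{A}}\}$.

$\Sigma^{a}_{k,\mathcal{A}} =_{df} \{Q_1 \cdots Q_n F \in QAF^{a}_{\mathcal{A}} \mid Q_1 = \exists, k(Q_1 \cdots Q_n) = k-1\}$.

$\Sigma_{k,\mathcal{A}} =_{df} \{(qF,b) \mid qF \in \Sigma^{b}_{k,\mathcal{A}}\}$.

$\Pi^{a}_{k,\mathcal{A}} =_{df} \{Q_1 \cdots Q_n F \in QAF^{a}_{\mathcal{A}} \mid Q_1 = \forall, k(Q_1 \cdots Q_n) = k-1\}$.

$\Pi_{k,\mathcal{A}} =_{df} \{(qF,b) \mid qF \in \Pi^{b}_{k,\mathcal{A}}\}$.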

These definitions are a generalization of the well known problems. If the algebra 
is finite, we can determine upper bounds for the problems: 

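Theorem 1. Let $\mathcal{A} = (A, M)$ be a finite algebra, $a \in A$ and $k \ge 1$.

1. $VAL^{a}_{\mathcal{A}}, VAL_{\mathcal{A}} \in$ ALOGTIME.

2. $TAUT^{a}_{\mathcal{A}}, TAUT_{\mathcal{A}} \in$ coNP.

3. $SAT^{a}_{\mathcal{A}}, SAT_{\mathcal{A}} \in$ NP.

4. $\#^{a}_{\mathcal{A}}, \#_{\mathcal{A}} \in$ #P.

5. $\Sigma^{a}_{k,\mathcal{A}}, \Sigma_{k,\mathcal{A}} \in \Sigma^{p}_{k}$, $\Pi^{a}_{k,\mathcal{A}}, \Pi_{k,\mathcal{A}} \in \Pi^{p}_{k}$ and $QAF^{a}_{\mathcal{A}}, QAF_{\mathcal{A}} \in$ PSPACE.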

Proof. The first statement follows from a result by Buss [3]: Every parenthesis 
context-free language is in ALOGTIME. The other statements are trivial, since 
A is finite. 

Theorem 1 cannot be improved in general, since the upper bounds are strict in 
the case of the boolean algebra: 

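Theorem 2 (Bu87,RW99). Let $\mathcal{B} =_{df} (\{0,1\},\{\wedge,\vee,\neg\})$ and $k \ge 1$.

1. $VAL_{\mathcal{B}}$ is complete for ALOGTIME.

2. $TAUT_{\mathcal{B}}$ is complete for coNP.

3. $SAT_{\mathcal{B}}$ is complete for NP.

4. $\#_{\mathcal{B}}$ is complete for #P.

5. $\Sigma_{k,\mathcal{B}}$, $\Pi_{k,\mathcal{B}}$, and $QAF_{\mathcal{B}}$, resp., are complete for $\Sigma^{p}_{k}$, $\Pi^{p}_{k}$, and PSPACE,
resp.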

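On the other hand, if we drop the negation, most problems turn out to be easy:

Theorem 3 (Bu87,RW99). Let $\mathcal{M} =_{df} (\{0,1\},\{\wedge,\vee\})$ and $k \ge 1$.

1. $VAL_{\mathcal{M}} \in$ ALOGTIME.

2. $TAUT_{\mathcal{M}} \in$ ALOGTIME.

3. $SAT_{\mathcal{M}} \in$ ALOGTIME.

4. $\#_{\mathcal{M}}$ is complete for #P. If $\#_{\mathcal{M}}$ is $\le^{\log}_{m}$-complete for #P then P = NP.

5. $\Sigma_{k,\mathcal{M}}, \Pi_{k,\mathcal{M}}, QAF_{\mathcal{M}} \in$ ALOGTIME.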

Turning to finite lattices, we show that non-distributivity nearly replaces the
negation in the different results of Theorem 2 and Theorem 3.

3 Lattices

Now we want to study lattices. For binary functions it is usual to use infix
notation. The previous results hold unchanged in this case. An algebra
$\mathcal{V} = (V, \{\wedge, \vee\})$ is called a lattice if both binary functions $\wedge$ and $\vee$ are commutative
and associative, and if they satisfy

$$\forall a, b \in V : (a \wedge b) \vee b = b \text{ and } (a \vee b) \wedge b = b.$$

In a lattice $\mathcal{V}$ we can define a partial order $\le$ by $a \le b \Leftrightarrow_{df} a \wedge b = a$. So we
have the usual relations like <, $\ge$ and >.

Definition 3. An element $a \in V$ is called a direct predecessor of $b \in V$ if
$a < b$ and $\{v \in V \mid a \le v \le b\} = \{a, b\}$.

We denote this as $a <_1 b$, and we call a and b neighbours.
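As a concrete illustration of these definitions, the following sketch uses the divisors of 12 with gcd as meet and lcm as join. The choice of this lattice and the helper names are assumptions for the example only; the code checks the lattice axioms by brute force and lists direct predecessors according to Definition 3.

```python
from itertools import product
from math import gcd

V = [1, 2, 3, 4, 6, 12]          # divisors of 12, ordered by divisibility


def meet(a, b):
    return gcd(a, b)


def join(a, b):
    return a * b // gcd(a, b)    # least common multiple


def is_lattice():
    pairs = list(product(V, repeat=2))
    triples = list(product(V, repeat=3))
    commutative = all(meet(a, b) == meet(b, a) and join(a, b) == join(b, a)
                      for a, b in pairs)
    associative = all(meet(meet(a, b), c) == meet(a, meet(b, c))
                      and join(join(a, b), c) == join(a, join(b, c))
                      for a, b, c in triples)
    absorption = all(join(meet(a, b), b) == b and meet(join(a, b), b) == b
                     for a, b in pairs)
    return commutative and associative and absorption


def leq(a, b):
    return meet(a, b) == a       # a <= b  iff  a meet b = a


def direct_predecessors(b):
    # a is a direct predecessor of b if a < b and nothing lies strictly between.
    return [a for a in V
            if a != b and leq(a, b)
            and all(v in (a, b) for v in V if leq(a, v) and leq(v, b))]
```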

Now we determine a lower bound for the counting problems in finite lattices:

Theorem 4. If $\mathcal{V}$ is a finite lattice with at least two elements, then the counting
problem $\#^{v}_{\mathcal{V}}$ for any $v \in V$ is complete for #P.

Proof. Let $a \in V$ be a direct predecessor of $b \in V$. We reduce #3SAT
to $\#^{a}_{\mathcal{V}}$ and $\#^{b}_{\mathcal{V}}$. Define $f(x) =_{df} ((x \vee a) \wedge b)$ and divide V into the sets
$V_0 =_{df} \{v \in V \mid f(v) = a\}$ and $V_1 =_{df} \{v \in V \mid f(v) = b\}$. Let F be
an instance of 3SAT with variables $x_1, \ldots, x_n$. Replace every literal $x_k$ ($\neg x_k$,
resp.) by $f(x_k)$ ($f(x_{k+n})$, resp.). We obtain a $\mathcal{V}$-formula $H_0(x_1, \ldots, x_{2n})$. Let
$H_1 =_{df} \bigwedge_{i=1}^{n}(f(x_i) \vee f(x_{n+i}))$, $H_2 =_{df} \bigvee_{i=1}^{n}(f(x_i) \wedge f(x_{n+i}))$, and
$H_3 =_{df} ((H_0 \wedge H_1) \vee H_2)$. We obtain:

$$\#3SAT(F) \cdot |V_0 \times V_1|^{n} = \#^{b}_{\mathcal{V}}(H_3) - \#^{b}_{\mathcal{V}}(H_2)$$

$$= \#^{b}_{\mathcal{V}}(H_3) - (|V|^{2n} - (|V|^{2} - |V_1|^{2})^{n})$$

$$= (|V|^{2} - |V_1|^{2})^{n} - \#^{a}_{\mathcal{V}}(H_3).$$
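The counting identity can be checked by brute force on a toy instance. The sketch below is an illustration under assumptions that are not taken from the paper: the lattice is the three-element chain {0, 1, 2} with min as meet and max as join, a = 1 and b = 2, and the 3CNF formula is a small hand-picked example.

```python
from itertools import product

V = (0, 1, 2)          # three-element chain: meet = min, join = max
a, b = 1, 2            # a is a direct predecessor of b


def f(x):
    return min(max(x, a), b)        # f(x) = (x join a) meet b, always a or b


V0 = [v for v in V if f(v) == a]
V1 = [v for v in V if f(v) == b]

# Toy instance F = (x1 or x2 or x2) and (not x1 or not x2 or not x2);
# the integer k stands for x_k and -k for its negation.
n = 2
clauses = [(1, 2, 2), (-1, -2, -2)]


def slot(lit):
    # Positive literals read variable k, negative ones variable n + k (0-based).
    return lit - 1 if lit > 0 else n - lit - 1


def H0(w):
    return min(max(f(w[slot(l)]) for l in cl) for cl in clauses)


def H1(w):
    return min(max(f(w[i]), f(w[n + i])) for i in range(n))


def H2(w):
    return max(min(f(w[i]), f(w[n + i])) for i in range(n))


def H3(w):
    return max(min(H0(w), H1(w)), H2(w))


def count(H, value):
    return sum(1 for w in product(V, repeat=2 * n) if H(w) == value)


num_sat = sum(1 for y in product((False, True), repeat=n)
              if all(any(y[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses))

lhs = num_sat * (len(V0) * len(V1)) ** n
mid = count(H3, b) - count(H2, b)
rhs = (len(V) ** 2 - len(V1) ** 2) ** n - count(H3, a)
```

All three quantities agree on this instance, matching the chain of equalities in the proof.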

Corollary 1. If $\mathcal{V}$ is a finite lattice with at least two elements, then the counting
problem $\#_{\mathcal{V}}$ is complete for #P.

In a finite lattice $\mathcal{V}$ we always have a bottom element $0 =_{df} \bigwedge_{v \in V} v$ and a top
element $1 =_{df} \bigvee_{v \in V} v$. These elements satisfy $(\forall v \in V : 0 \le v \le 1)$.

Then the monotonicity of the functions $\wedge$ and $\vee$ gives us the following Proposition:

Proposition 1. Let $\mathcal{V}$ be a finite lattice, $H(x_1, \ldots, x_n) \in F(\mathcal{V})$ and $a \in V$.
Then the following equivalence holds:

$$(\forall w \in V^{n} : H(w) = a) \iff H(0, \ldots, 0) = H(1, \ldots, 1) = a.$$

Theorem 5. Let $\mathcal{V}$ be a finite lattice and let $a \in V$. Then the problems
$TAUT^{a}_{\mathcal{V}}$ and $TAUT_{\mathcal{V}}$ belong to ALOGTIME.

Proof. Using Proposition 1 it is easy to reduce a tautology problem to an
evaluation problem. So the tautology problems belong to ALOGTIME.
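The reduction from tautology to evaluation can be made concrete with a small sketch. The lattice (divisors of 12 with gcd and lcm), the tuple encoding of formulas and the function names are assumptions for this illustration; the code compares the two-evaluation test of Proposition 1 with a brute-force check over all assignments.

```python
from itertools import product
from math import gcd

V = (1, 2, 3, 4, 6, 12)        # divisors of 12 ordered by divisibility
BOTTOM, TOP = 1, 12


def meet(x, y):
    return gcd(x, y)


def join(x, y):
    return x * y // gcd(x, y)


def ev(F, w):
    """Evaluate a formula built from ("c", v), ("x", i), ("and", F, G), ("or", F, G)."""
    tag = F[0]
    if tag == "c":
        return F[1]
    if tag == "x":
        return w[F[1]]
    left, right = ev(F[1], w), ev(F[2], w)
    return meet(left, right) if tag == "and" else join(left, right)


def taut_by_endpoints(F, n, a):
    # Proposition 1: it suffices to evaluate at the all-bottom and all-top inputs.
    return ev(F, (BOTTOM,) * n) == a == ev(F, (TOP,) * n)


def taut_by_brute_force(F, n, a):
    return all(ev(F, w) == a for w in product(V, repeat=n))


x0, x1 = ("x", 0), ("x", 1)
FORMULAS = [
    ("or", ("and", x0, ("c", 2)), ("c", 2)),   # (x0 meet 2) join 2, constantly 2
    ("and", ("or", x0, x1), ("c", 6)),
    ("or", x0, x1),
]
```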

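Theorem 6. Let $\mathcal{V}$ be a finite lattice, $k \ge 1$, and $a \in V$. It holds:

$$\Sigma^{a}_{2k,\mathcal{V}}, \Sigma_{2k,\mathcal{V}} \in \Sigma^{p}_{2k-1} \quad\text{and}\quad \Pi^{a}_{2k+1,\mathcal{V}}, \Pi_{2k+1,\mathcal{V}} \in \Pi^{p}_{2k}.$$

Proof. The last quantifier in a quantified formula of the given problems is $\forall$. So
let $Q_1, \ldots, Q_n \in \{\forall, \exists\}$ and $i \in \mathbb{N}$ with $\exists = Q_i \ne Q_{i+1} = \ldots = Q_n = \forall$. It
holds:

$$Q_1 \cdots Q_n F(x_1, \ldots, x_n) \in QAF^{a}_{\mathcal{V}}$$

if and only if

$$Q_1 a_1 \in V \cdots Q_i a_i \in V : F(a_1, \ldots, a_i, x_{i+1}, \ldots, x_n) \in TAUT^{a}_{\mathcal{V}}.$$

By Theorem 5 we know $TAUT^{a}_{\mathcal{V}} \in$ ALOGTIME $\subseteq$ P. Furthermore there is
one alternation less in $Q_1 \cdots Q_i$ than in $Q_1 \cdots Q_n$.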

4 Distributive Lattices 

Distributivity is usually a property of a whole lattice, but we introduce
distributive elements, too. An element a of a lattice $\mathcal{V}$ is called distributive if

$$\forall b, c \in V : a \wedge (b \vee c) = (a \wedge b) \vee (a \wedge c) \text{ and}$$

$$\forall b, c \in V : a \vee (b \wedge c) = (a \vee b) \wedge (a \vee c).$$

A lattice is called distributive if each element is distributive.

Proposition 2. Let $\mathcal{V}$ be a lattice, $H(x_1, \ldots, x_n) \in F(\mathcal{V})$ and $w \in V^{n}$. Each
distributive element $a \in V$ satisfies:

$$a \wedge H(w) \le H(a, \ldots, a) \le a \vee H(w).$$

Proof. We prove both inequalities by induction on the structure of H:

(1) $H(x_1, \ldots, x_n) = v \in V$:

$a \wedge H(w) \le H(w) = H(a, \ldots, a) = H(w) \le a \vee H(w)$.

(2) $H(x_1, \ldots, x_n) = x_i$:

$a \wedge H(w) \le a = H(a, \ldots, a) = a \le a \vee H(w)$.

(3) $H(x_1, \ldots, x_n) = F(x_1, \ldots, x_n) \wedge G(x_1, \ldots, x_n)$:

$a \wedge H(w) = a \wedge (F(w) \wedge G(w)) = (a \wedge F(w)) \wedge (a \wedge G(w))$

$\le F(a, \ldots, a) \wedge G(a, \ldots, a) = H(a, \ldots, a)$

$\le (a \vee F(w)) \wedge (a \vee G(w))$

$= a \vee (F(w) \wedge G(w)) = a \vee H(w)$.

(4) $H(x_1, \ldots, x_n) = F(x_1, \ldots, x_n) \vee G(x_1, \ldots, x_n)$: Dual to (3).

Proposition 3. Let $\mathcal{V}$ be a lattice, $H(x_1, \ldots, x_n) \in F(\mathcal{V})$, and let $a \in V$ be a
distributive element. Then the following equivalence holds:

$$(\exists w \in V^{n} : H(w) = a) \iff H(a, \ldots, a) = a.$$

Proof. Let $H(x_1, \ldots, x_n) \in F(\mathcal{V})$ and $w \in V^{n}$ be such that $H(w) = a$. From
Proposition 2 we obtain $a = a \wedge H(w) \le H(a, \ldots, a) \le a \vee H(w) = a$. So this
yields $H(a, \ldots, a) = a$. The converse is easy, since $(a, \ldots, a) \in V^{n}$.
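Proposition 3 turns satisfiability for a distributive target element into a single evaluation. The sketch below checks this on a small distributive lattice; the divisor lattice of 12, the tuple encoding of formulas and the function names are assumptions chosen for the illustration.

```python
from itertools import product
from math import gcd

V = (1, 2, 3, 4, 6, 12)        # divisor lattice of 12, which is distributive


def meet(x, y):
    return gcd(x, y)


def join(x, y):
    return x * y // gcd(x, y)


def ev(F, w):
    """Evaluate a formula built from ("c", v), ("x", i), ("and", F, G), ("or", F, G)."""
    tag = F[0]
    if tag == "c":
        return F[1]
    if tag == "x":
        return w[F[1]]
    left, right = ev(F[1], w), ev(F[2], w)
    return meet(left, right) if tag == "and" else join(left, right)


def sat_by_single_evaluation(F, n, a):
    # Proposition 3: H is satisfiable for target a iff H(a, ..., a) = a.
    return ev(F, (a,) * n) == a


def sat_by_brute_force(F, n, a):
    return any(ev(F, w) == a for w in product(V, repeat=n))


x0, x1 = ("x", 0), ("x", 1)
FORMULAS = [
    ("or", ("and", x0, ("c", 4)), x1),         # (x0 meet 4) join x1
    ("and", ("or", x0, ("c", 6)), ("c", 4)),   # (x0 join 6) meet 4
    ("and", x0, x1),
]
```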



Theorem 7. Let $\mathcal{V}$ be a finite lattice and let $a \in V$ be a distributive element.
Then the satisfiability problem $SAT^{a}_{\mathcal{V}}$ belongs to ALOGTIME.

Proof. Using Proposition 3 it is easy to reduce the satisfiability problem $SAT^{a}_{\mathcal{V}}$
to the evaluation problem $VAL^{a}_{\mathcal{V}}$. So $SAT^{a}_{\mathcal{V}}$ belongs to ALOGTIME.

Corollary 2. Let $\mathcal{V}$ be a finite, distributive lattice. Then the satisfiability
problem $SAT_{\mathcal{V}}$ belongs to ALOGTIME.

Corollary 3. Let $\mathcal{V}$ be a finite lattice and let $a \in V$ be a distributive element.
If $\#^{a}_{\mathcal{V}}$ is $\le^{\log}_{m}$-complete for #P then P = NP.

Theorem 8. Let $\mathcal{V}$ be a finite lattice, $k \ge 1$, and let $a \in V$ be a distributive
element. The problems $\Sigma^{a}_{k,\mathcal{V}}$, $\Pi^{a}_{k,\mathcal{V}}$, and $QAF^{a}_{\mathcal{V}}$ belong to ALOGTIME.

Proof. It is sufficient to show $QAF^{a}_{\mathcal{V}} \in$ ALOGTIME. For $v \in \{0, 1\}$ (bottom and
top element in V) define $\beta^{a}_{v}(\forall) =_{df} v$ and $\beta^{a}_{v}(\exists) =_{df} a$. We show the following:

$$Q_1 \cdots Q_n F(x_1, \ldots, x_n) \in QAF^{a}_{\mathcal{V}}$$

if and only if

$$F(\beta^{a}_{0}(Q_1), \ldots, \beta^{a}_{0}(Q_n)) = F(\beta^{a}_{1}(Q_1), \ldots, \beta^{a}_{1}(Q_n)) = a.$$

For n = 0 this statement is obvious.

For n > 0: Let $F(x_1, \ldots, x_{n+1}) \in F(\mathcal{V})$ be a $\mathcal{V}$-formula and let $Q_1, \ldots, Q_{n+1} \in \{\forall, \exists\}$.
We define $G(x,y) =_{df} F(x, \beta^{a}_{y}(Q_2), \ldots, \beta^{a}_{y}(Q_{n+1}))$. By the induction
hypothesis, it remains to prove the following equivalence for $Q \in \{\forall, \exists\}$:

$$(Qv \in V : G(v,0) = G(v,1) = a) \iff G(\beta^{a}_{0}(Q), 0) = G(\beta^{a}_{1}(Q), 1) = a.$$

$Q = \exists$: Proposition 3 shows the equivalence, since a is distributive.

$Q = \forall$: We use Proposition 1 and obtain:

$$(\forall v \in V : G(v,0) = G(v,1) = a) \iff G(0,0) = G(1,0) = G(0,1) = G(1,1) = a.$$

The monotonicity of $\wedge$ and $\vee$ yields anyway $G(0,0) \le G(0,1) \le G(1,1)$ and
$G(0,0) \le G(1,0) \le G(1,1)$, and we obtain the equivalence.

So it is sufficient to evaluate F at the following two positions: Replace every
$\exists$-quantified variable by a, and replace every $\forall$-quantified variable by 0 (1, resp.)
for the first (second, resp.) evaluation. The quantified formula is in $QAF^{a}_{\mathcal{V}}$ if and
only if both evaluations of F return the value a.

Now it is easy to reduce the problems $\Sigma^{a}_{k,\mathcal{V}}$, $\Pi^{a}_{k,\mathcal{V}}$, and $QAF^{a}_{\mathcal{V}}$ to an
evaluation problem. So these problems belong to ALOGTIME.
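The two-evaluation test from this proof can be compared against a direct evaluation of the quantifier prefix. The following sketch is a toy check under assumptions: the lattice is the divisor lattice of 12 (distributive, so every element qualifies as the target a), quantifier prefixes are written as strings over "A" (for all) and "E" (exists), and all names are hypothetical.

```python
from itertools import product
from math import gcd

V = (1, 2, 3, 4, 6, 12)        # divisor lattice of 12, which is distributive
BOTTOM, TOP = 1, 12


def meet(x, y):
    return gcd(x, y)


def join(x, y):
    return x * y // gcd(x, y)


def ev(F, w):
    """Evaluate a formula built from ("c", v), ("x", i), ("and", F, G), ("or", F, G)."""
    tag = F[0]
    if tag == "c":
        return F[1]
    if tag == "x":
        return w[F[1]]
    left, right = ev(F[1], w), ev(F[2], w)
    return meet(left, right) if tag == "and" else join(left, right)


def qaf_brute_force(quants, F, a, prefix=()):
    # Expand the quantifier prefix from left to right over all lattice elements.
    if len(prefix) == len(quants):
        return ev(F, prefix) == a
    branches = (qaf_brute_force(quants, F, a, prefix + (v,)) for v in V)
    return all(branches) if quants[len(prefix)] == "A" else any(branches)


def qaf_two_evaluations(quants, F, a):
    # Existential variables become a; universal ones become bottom, then top.
    low = tuple(BOTTOM if q == "A" else a for q in quants)
    high = tuple(TOP if q == "A" else a for q in quants)
    return ev(F, low) == a == ev(F, high)


x0, x1 = ("x", 0), ("x", 1)
FORMULAS = [
    ("and", x0, x1),
    ("or", x0, x1),
    ("or", ("and", x0, ("c", 4)), x1),
    ("and", ("or", x0, x1), ("c", 6)),
]
PREFIXES = ["".join(p) for p in product("AE", repeat=2)]
```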



Corollary 4. Let $\mathcal{V}$ be a finite, distributive lattice and $k \ge 1$. Then the problems
$\Sigma_{k,\mathcal{V}}$, $\Pi_{k,\mathcal{V}}$, and $QAF_{\mathcal{V}}$ belong to ALOGTIME.

So most of the problems are easy in finite, distributive lattices. But what 
happens, if the finite lattice is non-distributive? 



5 Non-distributive Lattices 

According to Birkhoff [1], each non-distributive lattice has at least one of the
following two lattices as a sublattice:

[Figure: Hasse diagrams of the two lattices, labelled Pentagon and Diamond]

Definition 4. Let $\mathcal{V}$ be a non-distributive lattice. A triple $(a,b,c) \in V^{3}$ such
that $a \wedge b = c \wedge b$, $a \vee b = c \vee b$, and either $a < c$ or ($a \parallel c$, $a \wedge c = a \wedge b$ and
$a \vee c = a \vee b$), is called non-distributive.
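The two forbidden sublattices and Definition 4 can be explored with a short sketch. Everything here is an assumption made for illustration: the lattices are given by their covering pairs, the helper names are hypothetical, the second alternative of Definition 4 is read as "a and c are incomparable", and only the first distributive law is tested (in a lattice the two laws are equivalent).

```python
from itertools import product


def make_lattice(elements, covers):
    """Build (lt, meet, join) from the covering pairs (lower, upper) of a finite lattice."""
    leq = {(v, v) for v in elements} | set(covers)
    while True:  # reflexive-transitive closure of the covering relation
        new = {(p, r) for (p, q) in leq for (q2, r) in leq if q == q2} - leq
        if not new:
            break
        leq |= new

    def meet(x, y):
        lower = [v for v in elements if (v, x) in leq and (v, y) in leq]
        return next(g for g in lower if all((v, g) in leq for v in lower))

    def join(x, y):
        upper = [v for v in elements if (x, v) in leq and (y, v) in leq]
        return next(l for l in upper if all((l, v) in leq for v in upper))

    def lt(x, y):
        return x != y and (x, y) in leq

    return lt, meet, join


ELEMS = ["0", "a", "b", "c", "1"]
PENTAGON = make_lattice(ELEMS, [("0", "a"), ("a", "c"), ("c", "1"), ("0", "b"), ("b", "1")])
DIAMOND = make_lattice(ELEMS, [("0", v) for v in "abc"] + [(v, "1") for v in "abc"])


def is_distributive(lattice):
    _, meet, join = lattice
    return all(meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
               for x, y, z in product(ELEMS, repeat=3))


def is_nondistributive_triple(lattice, a, b, c):
    lt, meet, join = lattice
    if meet(a, b) != meet(c, b) or join(a, b) != join(c, b):
        return False
    incomparable = a != c and not lt(a, c) and not lt(c, a)
    return lt(a, c) or (incomparable
                        and meet(a, c) == meet(a, b)
                        and join(a, c) == join(a, b))
```

In both lattices the triple (a, b, c) qualifies: through a < c in the pentagon and through the incomparable alternative in the diamond.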



Definition 5. A non-distributive triple $(a,b,c) \in V^{3}$ is called maximal
non-distributive if each non-distributive triple $(a',b',c') \in V^{3}$ satisfies:

(M1) $a' \wedge b' \not> a \wedge b$,

(M2) $a' \wedge b' = a \wedge b \Rightarrow a' \vee b' \not< a \vee b$,

(M3) $(a' \wedge b' = a \wedge b,\ a' \vee b' = a \vee b,\ a \parallel c) \Rightarrow a' \not< c'$.

Obviously each finite non-distributive lattice has at least one such triple.

Lemma 1. Let $\mathcal{V}$ be a lattice and let $(a,b,c) \in V^{3}$ be a maximal non-distributive
triple. Then each $c^{+} \in V$ satisfies: $(c^{+} \ge c$ and $c^{+} \not\ge c \vee b) \Rightarrow c^{+} \wedge b = c \wedge b$.

Proof. At first let us define $b' =_{df} c^{+} \wedge b$, $c' =_{df} b' \vee c$, and $a' =_{df} b' \vee a$.
Supposing $b' \ne c \wedge b$ we get one of the following four cases, all of them leading
to a contradiction:

[Figure: four Hasse diagrams illustrating Case 1a, Case 1b, Case 2a and Case 2b]

Case 1: $a < c$. (Pentagon-case)

Case 1a: $a' < c'$. Here we obtain the situation shown in the first picture.
$(a',b,c')$ is a non-distributive triple, contradicting (M1).

Case 1b: $a' = c'$. Here we obtain the situation shown in the second picture.
$(a,b',c)$ is a non-distributive triple, contradicting (M2).

Case 2: $a \parallel c$. (Diamond-case)

Case 2a: $b \le a'$, therefore $a' = a \vee b$. Here we obtain the situation shown in the
third picture. $(b',a,b)$ is a non-distributive triple, contradicting (M3).

Case 2b: $b \not\le a'$, therefore $a' < a \vee b$. Here we obtain the situation shown
in the last picture. Let $V' =_{df} \{v \in V \mid v \ge b'\}$. We get $a', b, c' \in V'$ and
$(a' \vee c') \wedge b \ne (a' \wedge b) \vee (c' \wedge b)$. This implies that $\mathcal{V}'$ is a non-distributive sublattice.
Hence $\mathcal{V}'$ contains at least one non-distributive triple, contradicting (M1).



Theorem 9. Let $\mathcal{V}$ be a finite, non-distributive lattice. Then there exists a $c' \in V$
such that the satisfiability problem $SAT^{c'}_{\mathcal{V}}$ is $\le^{\log}_{m}$-complete for NP.

Proof. Let $\mathcal{V}$ be a finite, non-distributive lattice, $(a,b,c) \in V^{3}$ a maximal
non-distributive triple, and $c' \in V$ such that $c \le c' <_1 c \vee b$.

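We reduce 3SAT to $SAT^{c'}_{\mathcal{V}}$. Let F be an instance of 3SAT with variables
$x_1, \ldots, x_n$, i.e. $F = \bigwedge_{i=1}^{m} \bigvee_{j=1}^{3} x_{ij}$ with $x_{ij} \in \{x_1, \ldots, x_n, \neg x_1, \ldots, \neg x_n\}$.

First we define $f(x) =_{df} ((x \vee c') \wedge (c \vee b))$ and

$$k_{ij} =_{df} \begin{cases} k & \text{if } x_{ij} = x_k, \\ n + k & \text{if } x_{ij} = \neg x_k. \end{cases}$$

So we can define the following $\mathcal{V}$-formulas:

$$H_0(x_1, \ldots, x_{2n}) =_{df} \bigwedge_{i=1}^{m} \bigvee_{j=1}^{3} f(x_{k_{ij}}),$$

$$H_1(x_1, \ldots, x_{2n}) =_{df} \bigwedge_{i=1}^{n} (f(x_i) \vee f(x_{n+i})),$$

$$H_2(x_1, \ldots, x_{2n}) =_{df} \bigvee_{i=1}^{n} (f(x_i) \wedge f(x_{n+i})),$$

$$H_0'(x_1, \ldots, x_{2n}) =_{df} (H_0 \wedge b) \vee a,$$

$$H_1'(x_1, \ldots, x_{2n}) =_{df} (H_1 \wedge b) \vee a,$$

$$H_2'(x_1, \ldots, x_{2n}) =_{df} (H_2 \wedge b) \vee c', \text{ and}$$

$$H(x_1, \ldots, x_{2n}) =_{df} H_0' \wedge H_1' \wedge H_2'.$$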



With $c' <_1 c \vee b$, each $v \in V$ satisfies $f(v) \in \{c', c \vee b\}$. So we get for each $w \in V^{2n}$:
$H_0(w), H_1(w), H_2(w) \in \{c', c \vee b\}$. With $V_0 =_{df} \{v \in V \mid f(v) = c'\}$ and
$V_1 =_{df} \{v \in V \mid f(v) = c \vee b\}$ we can define $\Omega_y$ for each $y = (y_1, \ldots, y_n) \in \{0,1\}^{n}$
by $\Omega_y =_{df} \{(w_1, \ldots, w_{2n}) \in V^{2n} \mid \forall i\,(1 \le i \le n) : (w_i, w_{n+i}) \in V_{y_i} \times V_{1-y_i}\}$.

We show:

$$w \in \bigcup_{F(y)=1} \Omega_y \iff H(w) = c'.$$

First, let $y \in \{0,1\}^{n}$ be such that F(y) = 1, and let $w \in \Omega_y$. We can calculate
$H_0(w) = c \vee b$, $H_1(w) = c \vee b$ and $H_2(w) = c'$. So we obtain $H_0'(w) = c \vee b$,
$H_1'(w) = c \vee b$ and $H_2'(w) = c'$, and it holds $H(w) = c'$.

Now let $w = (w_1, \ldots, w_{2n}) \notin \bigcup_{F(y)=1} \Omega_y$. We obtain at least one of the
following three cases:

(1) $w \in \bigcup_{F(y)=0} \Omega_y$.

(2) $\exists i\,(1 \le i \le n) : f(w_i) = f(w_{n+i}) = c'$.

(3) $\exists i\,(1 \le i \le n) : f(w_i) = f(w_{n+i}) = c \vee b$.

As consequences we obtain from the three cases:

(1') $H_0(w) = c'$, therefore by Lemma 1: $H_0'(w) = a$.

(2') $H_1(w) = c'$, therefore by Lemma 1: $H_1'(w) = a$.

(3') $H_2(w) = c \vee b$, therefore we obtain: $H_2'(w) = c \vee b$.

If $H_0'(w) = a$ or $H_1'(w) = a$, then $H(w) \le a$, so $H(w) \ne c'$, because $c' \not\le a$.

Otherwise $H_0'(w) = H_1'(w) = c \vee b$ and with (3') this yields $H(w) = c \vee b \ne c'$.

So each $w \notin \bigcup_{F(y)=1} \Omega_y$ satisfies $H(w) \ne c'$. Now we have:

$$F \in 3SAT \iff H \in SAT^{c'}_{\mathcal{V}}.$$

The $\mathcal{V}$-formula H is computable with input F by a log-space algorithm, so the
satisfiability problem $SAT^{c'}_{\mathcal{V}}$ is $\le^{\log}_{m}$-complete for NP.

This completes the dichotomy to Corollary 2: 

Theorem 10. Let $\mathcal{V}$ be a finite lattice.

$\mathcal{V}$ is distributive $\Rightarrow$ $SAT_{\mathcal{V}} \in$ ALOGTIME.

$\mathcal{V}$ is non-distributive $\Rightarrow$ $SAT_{\mathcal{V}}$ is $\le^{\log}_{m}$-complete for NP.

This is the main dichotomy result, and we will extend it by similar results
for quantified formulas over finite lattices.

Let $\mathcal{B} =_{df} (\{0,1\},\{\wedge,\vee,\neg\})$. A $\mathcal{B}$-formula $F \in F(\mathcal{B})$ is called simple if $\neg$
only occurs at variables. Now we define a simple version of QBF, which is still
$\le^{\log}_{m}$-complete for PSPACE:

$$SQBF =_{df} \{Q_1 \cdots Q_n F(x_1, \ldots, x_n) \in QAF^{1}_{\mathcal{B}} \mid F \text{ is simple}\}.$$



Theorem 11. Let $\mathcal{V}$ be a finite, non-distributive lattice and $k \ge 1$. Then there
is a $c' \in V$ such that:




40 



B. Schwarz 



^ 2 k-i,v is <1^^ -complete for 

n^k V is -complete for and 

QAFy is <lff -complete for PSPACE. 



Proof. The proof of Theorem 9 does not need the formula F to be in 3CNF.
It is sufficient that F is simple. We obtain the V-formula H_0 from a simple
instance F by replacing every literal x_i by f(x_i) and replacing every literal
¬x_i by f(x_{n+i}).

So let Q_1 ⋯ Q_n F(x_1, …, x_n) be an instance of SQBF. We are now using the
notations of the proof of Theorem 9, in particular the elements c', c ∨ b ∈ V,
the function f, and the V-formula H(x_1, …, x_{2n}). Furthermore we define
additional quantifiers Q_{n+i} =df ∃ for i = 1, …, n and a mapping
α : V → {0, 1} by

∀v ∈ V: α(v) =df 0 if f(v) = c', and α(v) =df 1 if f(v) = c ∨ b.

Finally we define α_k(v_1, …, v_k) =df (α(v_1), …, α(v_k)) as a shortcut.
We prove by induction on i (0 ≤ i ≤ n):



∀y ∈ {0, 1}^i: Q_{i+1} ⋯ Q_n F(y, x_{i+1}, …, x_n) ∈ SQBF
⟹ ∀v ∈ α_i^{-1}(y): Q_{i+1} ⋯ Q_{2n} H(v, x_{i+1}, …, x_{2n}) ∈ QAF_V^{c'}

and

∀v ∈ V^i: Q_{i+1} ⋯ Q_{2n} H(v, x_{i+1}, …, x_{2n}) ∈ QAF_V^{c'}
⟹ Q_{i+1} ⋯ Q_n F(α_i(v), x_{i+1}, …, x_n) ∈ SQBF.

Case i = n: Already shown in the proof of Theorem 9.

Case i < n: Let Q_i = ∀ (Q_i = ∃, resp.).

First let y ∈ {0, 1}^{i-1} and Q_i ⋯ Q_n F(y, x_i, …, x_n) ∈ SQBF. Then we have:

Q_i y_i ∈ {0, 1}: Q_{i+1} ⋯ Q_n F(y, y_i, x_{i+1}, …, x_n) ∈ SQBF,

which results in:

Q_{i+1} ⋯ Q_n F(y, 0, x_{i+1}, …, x_n) ∈ SQBF

and (or, resp.)

Q_{i+1} ⋯ Q_n F(y, 1, x_{i+1}, …, x_n) ∈ SQBF.
By induction hypothesis, we obtain for all v ∈ α_{i-1}^{-1}(y):

∀v_i ∈ α^{-1}(0): Q_{i+1} ⋯ Q_{2n} H(v, v_i, x_{i+1}, …, x_{2n}) ∈ QAF_V^{c'}

and (or, resp.)

∀v_i ∈ α^{-1}(1): Q_{i+1} ⋯ Q_{2n} H(v, v_i, x_{i+1}, …, x_{2n}) ∈ QAF_V^{c'}.

Furthermore V = α^{-1}(0) ∪ α^{-1}(1). (In particular: c' ∈ α^{-1}(0) ≠ ∅ and
c ∨ b ∈ α^{-1}(1) ≠ ∅.) Hence it holds:

∀v ∈ α_{i-1}^{-1}(y): Q_i v_i ∈ V: Q_{i+1} ⋯ Q_{2n} H(v, v_i, x_{i+1}, …, x_{2n}) ∈ QAF_V^{c'}.

So we have: ∀v ∈ α_{i-1}^{-1}(y): Q_i ⋯ Q_{2n} H(v, x_i, …, x_{2n}) ∈ QAF_V^{c'}.
Hence the first statement is proved. To prove the second statement, let
v ∈ V^{i-1} and Q_i ⋯ Q_{2n} H(v, x_i, …, x_{2n}) ∈ QAF_V^{c'}. Then we have:

Q_i v_i ∈ V: Q_{i+1} ⋯ Q_{2n} H(v, v_i, x_{i+1}, …, x_{2n}) ∈ QAF_V^{c'}.






By induction hypothesis we get:

Q_i v_i ∈ V: Q_{i+1} ⋯ Q_n F(α_i(v, v_i), x_{i+1}, …, x_n) ∈ SQBF.

Together with α(V) = {0, 1} this yields: Q_i ⋯ Q_n F(α_{i-1}(v), x_i, …, x_n) ∈
SQBF. This completes the proof of the second statement.

In particular, the two statements for i = 0 result in:

Q_1 ⋯ Q_n F(x_1, …, x_n) ∈ SQBF ⟺ Q_1 ⋯ Q_{2n} H(x_1, …, x_{2n}) ∈ QAF_V^{c'}.

The reduction does not change the number of alternations between ∀ and ∃, if
the last quantifier is an ∃-quantifier.

Together with Theorem 6 this completes the dichotomy to Corollary 4:

Theorem 12. Let V be a finite lattice, and k ≥ 1.

V is distributive ⟹ Σ_{k,V}, Π_{k,V}, QAF_V ∈ ALOGTIME.

V is non-distributive ⟹
  Σ_{2k-1,V}, Σ_{2k,V} are ≤_m^log-complete for Σ_{2k-1}^p,
  Π_{2k,V}, Π_{2k+1,V} are ≤_m^log-complete for Π_{2k}^p,
  QAF_V is ≤_m^log-complete for PSPACE.

Finally we show that the counting problem of a finite lattice can be
≤_m^log-complete for #P. In Corollary 1 we have only shown
≤_{1-T}^p-completeness.

Theorem 13. Let V_1 be the "diamond"-lattice. The counting problem #_{V_1} is
≤_m^log-complete for #P.

Proof. We reduce #3SAT to #_{V_1}.

Let F be an instance of 3SAT with variables xi,. . . ,x„. Replace every literal 
Xk by (xfe V b) and replace every literal —^Xk by {xn+k V b). We obtain a Vi- 
formula Hz{xi,... ,X 2 n)- Furthermore let Hi =df ^ ^n+i) V c) A a), 

H2 =df vr=i(((^i ^ ^ri+i) y b) ha), Hi =df Ar=i(((®* V c) A a) V &), H 5 =df 
Ar=i(((^"+* V c) A a) V b), and H =df Hi V H 2 V (H 3 A H 4 A H 5 A c). We obtain: 



#vA^) = #3SAT(H) 
6 Results and Open Questions

Let V be a finite lattice with at least two elements, and k ≥ 1.



Problem        Complexity of the problem, if V is
               distributive                    non-distributive

VAL_V          ALOGTIME                        ALOGTIME
TAUT_V         ALOGTIME                        ALOGTIME
SAT_V          ALOGTIME                        ≤_m^log-complete for NP
Σ_{2k-1,V}     ALOGTIME                        ≤_m^log-complete for Σ_{2k-1}^p
Σ_{2k,V}       ALOGTIME                        ≤_m^log-complete for Σ_{2k-1}^p
Π_{2k,V}       ALOGTIME                        ≤_m^log-complete for Π_{2k}^p
Π_{2k+1,V}     ALOGTIME                        ≤_m^log-complete for Π_{2k}^p
QAF_V          ALOGTIME                        ≤_m^log-complete for PSPACE
#_V            ≤_{1-T}^p-complete for #P       ≤_{1-T}^p-complete for #P
               if P ≠ NP: not
               ≤_m^log-complete for #P





For the "diamond"-lattice V_1 the counting problem #_{V_1} is ≤_m^log-complete
for #P. So the question arises: which finite lattices have counting problems
that are ≤_m^log-complete for #P?

It was shown that P ≠ NP implies that distributive lattices do not have
counting problems that are ≤_m^log-complete for #P. But maybe this claim can be
shown independently.

We showed that for a distributive element a ∈ V the element-specific problem
SAT_V^a is in ALOGTIME. On the other hand we showed that in every
non-distributive lattice V there is at least one element c' ∈ V such that
SAT_V^{c'} is ≤_m^log-complete for NP. Of course this element is not
distributive. But what is the computational complexity of the problem SAT_V^a
for an arbitrary element a?



Acknowledgments. I have benefited greatly from discussions with Klaus Wagner,
and I want to thank Christian Glasser for helpful discussions.

References 

[1] G. Birkhoff, Lattice Theory, Colloquium Publications Vol. XXV, American
Mathematical Society, Providence, RI, 1967.

[2] D. M. Barrington, P. McKenzie, C. Moore, P. Tesson, and D. Thérien,
Equation Satisfiability and Program Satisfiability for Finite Monoids,
Proceedings of the 25th International Symposium on Mathematical Foundations of
Computer Science, pages 172-181, 2000.

[3] S. R. Buss, The Boolean formula value problem is in ALOGTIME, Proceedings
of the 19th Symposium on Theory of Computing, pages 123-131, 1987.

[4] S. A. Cook, The complexity of theorem-proving procedures, Proceedings of the
Third ACM Symposium on Theory of Computing, pages 151-158, 1971.

[5] M. R. Garey, D. S. Johnson, Computers and Intractability, W. H. Freeman and
Company, 1979.

[6] M. Goldmann, A. Russell, The complexity of solving equations over finite
groups, Proceedings of the 14th Annual IEEE Conference on Computational
Complexity, pages 80-86, 1999.

[7] C. H. Papadimitriou, Computational Complexity, Addison-Wesley, Reading, MA,
1994.

[Ru81] W. L. Ruzzo, On uniform circuit complexity, Journal of Computer and
System Sciences, 22:365-383, 1981.

[8] Steffen Reith, Klaus W. Wagner, The Complexity of Problems Defined by
Subclasses of Boolean Functions, Technical Report 218, Inst. für Informatik,
Univ. Würzburg, 1999. Available via ftp from
http://www.informatik.uni-wuerzburg.de/reports/tr.html

[9] Bernhard Schwarz, The Complexity of Satisfiability Problems over Finite
Lattices, Technical Report 314, Institut für Informatik, Universität Würzburg,
2004. Available via ftp from
http://www.informatik.uni-wuerzburg.de/reports/tr.html




Constant Width Planar Computation
Characterizes ACC°



Kristoffer Arnsfelt Hansen

Department of Computer Science, University of Aarhus, Denmark
arnsfelt@daimi.au.dk



Abstract. We obtain a characterization of ACC° in terms of a natural 
class of constant width circuits, namely in terms of constant width poly- 
nomial size planar circuits. This is shown via a characterization of the 
class of acyclic digraphs which can be embedded on a cylinder surface in 
such a way that all arcs flow along the same direction of the axis of the 
cylinder. 



1 Introduction 

This paper deals with the relationship between the computational power of width 
restricted circuits and depth restricted circuits. We relate constant width poly- 
nomial size planar circuits and also nondeterministic branching programs to 
constant depth polynomial size circuits. 

Constant width polynomial size circuits (and branching programs) were 
shown to have surprising computational power by Barrington [1]. The class of 
functions computed by constant width polynomial size circuits is exactly NC¹,
and is thus considerably more than the functions computed by constant depth 
polynomial size circuits, being AC°. 

Such connections are very interesting to explore, since they might provide 
the means for a better understanding of the classes involved, thus approaching 
lower bounds. Currently, however, obtaining lower bounds for NC¹ seems out of
reach. 

The smallest natural circuit class lacking lower bounds is ACC°, the subclass 
of NC¹ computed by constant depth polynomial size circuits allowing MOD
gates. By restricting the digraph representation of circuits geometrically, we 
obtain a characterization of ACC° in terms of constant width circuits. 

Theorem 1. Constant width, polynomial size planar circuits compute exactly
ACC°.

Although Barrington and Therien gave a characterization of ACC° in terms 
of finite monoids [4] and Yao proved a nontrivial upper bound for ACC° in 
terms of a class of threshold circuits [16], our result is the first alternative char- 
acterization of ACC° by a circuit model. 

Planarity has been previously employed in circuit lower bounds. While for
general circuits the best lower bounds for explicit functions are linear,
superlinear lower bounds are known for general planar circuits. These were
first obtained by Lipton and Tarjan based on their planar separator theorem
[10], although their lower bound did not allow inputs to appear more than once,
as we require in the above characterization. For general planar circuits
allowing inputs to occur several times, superlinear lower bounds were proved by
Turán [13] and improved to the current best lower bound of Ω(n log² n) by
Gröger [7].

The circuits of Theorem 1 are restricted in two more aspects than planarity, 
namely that of constant width and of monotonicity in gate operations. Thus there 
is hope of obtaining much better lower bounds than just slightly superlinear. 
Furthermore one could hope that the geometric perspective on computation can
lead to new ways of understanding the internal structure of NC¹, thus obtaining
progress towards separating ACC° and NC¹.

Much of the naturalness of ACC° comes from the algebraic characterizations 
by Barrington and Therien of the classes AC°, ACC° and NC¹ in terms of
restrictions of finite monoids [4]. The class AC° has also been characterized in 
terms of geometric restrictions by Vinay [14] and Barrington et al [2,3]. It was 
shown that the class of functions computed by constant width polynomial size 
upwards planar circuits (and nondeterministic branching programs) is exactly 
AC°. 

An intermediate geometric restriction between upwards planarity and pla- 
narity was studied in [8]. There progress towards characterizing ACC° was 
obtained by relaxing the geometrical restriction of upwards planarity to that of 
cylindricality. While the exact relation to constant depth circuits is unknown un- 
der this restriction, it was shown that constant width polynomial size cylindrical 
circuits (and nondeterministic branching programs) can compute a strict super- 
class of AC°, while still only computing functions in ACC°. It is by building 
upon this result that we obtain Theorem 1. 

The restrictions of upwards planarity and cylindricality are similar in the 
sense that each layer of the circuit is drawn “together”, in such a way that all 
arcs flow in a common direction. Under the restriction of planarity, nodes are in 
contrast allowed to be placed in an arbitrary way in the plane. 

The results on the computational power of cylindrical circuits, as well as the 
characterization of AC° in terms of upwards planar circuits are actually based 
upon the algebraic characterizations by Barrington and Therien [4]. Thus with
the present characterization of ACC° and the previous characterizations of AC° 
in terms of geometric restrictions, the link between algebra and geometry in a 
computational setting seems very strong. 

The key to applying the results on the computational power of cylindrical 
circuits for proving Theorem 1, is in identifying exactly which planar digraphs 
are cylindrical. 

Theorem 2. A layered digraph D is layered cylindrical if and only if it is a 
subgraph of an acyclic planar layered digraph with a unique source and sink. 

This theorem is implicit in the works of Tamassia and Tollis [12] on tessel- 
lation representations of graphs in the plane and on a sphere. By “cutting” a 
digraph along a path from the source to the sink, they effectively reduce the 
spherical case (or equivalently the cylindrical case) to the planar case. 
We give another proof of the theorem, working directly on the given planar
embedding, taking advantage of the digraph being layered. This allows us to di- 
rectly extract the combinatorial characterization of cylindricality used in [8] and 
also makes later uniformity considerations easier. With appropriate definitions 
of uniformity we can obtain the following uniform version of Theorem 1. 

Theorem 3. Logspace-uniform constant width, polynomial size planar circuits
compute exactly logspace-uniform ACC°.

Organisation of Paper. In Sect. 2 we introduce the notions of embeddings 
and circuits we will consider. We also state some basic properties about planar 
embeddings of certain digraphs. In Sect. 3 we prove Theorem 2, and in Sect. 
4 we prove Theorem 1, as well as characterizing the computational power of 
planar branching programs. We introduce combinatorial embeddings in Sect. 5 
as a means for dealing with uniformity issues. We conclude with some discussions 
and open problems in Sect. 6. 

2 Preliminaries 

Bounded Depth Circuits. AC° is the class of functions computed by polyno- 
mial size bounded depth circuits consisting of NOT gates and unbounded fanin 
AND and OR gates. ACC° is the class of functions computed when we also
allow unbounded fanin MOD gates computing MODm for constants m.

Planar and Cylindrical Embeddings of Digraphs. A digraph D = (V, A) is called
layered if there is a partition V = V_0 ∪ V_1 ∪ ⋯ ∪ V_h such that each arc of
A goes from layer V_i to the next layer V_{i+1} for some i. Given such a
partition, we call h the depth of D, |V_i| the width of layer i and
k = max |V_i| the width of D.
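
The definition above can be checked mechanically. The following is a minimal
Python sketch, assuming a digraph is given as a list of layers (each a list of
node names) together with a list of arcs; the function name and this input
format are hypothetical and only serve as illustration.

```python
def layered_stats(layers, arcs):
    """Return (depth, width) if every arc goes from layer i to layer i + 1.

    layers: list of lists of nodes, layers[i] playing the role of V_i.
    arcs:   list of (u, v) pairs.
    Returns None if some arc does not go to the next layer.
    """
    index = {}
    for i, layer in enumerate(layers):
        for node in layer:
            index[node] = i
    for u, v in arcs:
        if index[v] != index[u] + 1:
            return None
    depth = len(layers) - 1                      # h in V = V_0 ∪ ... ∪ V_h
    width = max(len(layer) for layer in layers)  # k = max |V_i|
    return depth, width


# A small example with one source "s", one sink "t" and a middle layer.
EXAMPLE_LAYERS = [["s"], ["a", "b"], ["t"]]
EXAMPLE_ARCS = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
```

For the example, the depth is 2 and the width is 2; adding an arc from "s"
directly to "t" would skip a layer and make the partition invalid.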

A digraph is planar if it can be embedded in the plane. It is called upward 
planar if it can be embedded in the plane, in a way such that all arcs are 
monotonically increasing in the vertical direction. It is called cylindrical if it can 
be embedded on a cylinder surface, in a way such that all arcs are monotonically 
increasing in the direction of the axis of the cylinder. 

We call a layered digraph layered cylindrical if it can be embedded on a 
cylinder surface, such that all arcs are monotonically increasing in the direction 
of the axis of the cylinder and that layers correspond to disjoint circles of the 
cylinder, which contain all the nodes of the layer. 

In [11] the following properties of planar embeddings are proved. They are
stated under the assumption of the digraph being 2-connected, but the proof 
does not use this assumption. 

Lemma 4. Let D be a planar acyclic digraph with a unique source and sink,
and consider any planar embedding of D. Then

1. Each face of the embedding consists of two directed paths.

2. For any vertex v of D all ingoing (outgoing) arcs of v appear consecutively
around v.
Bounded Width Branching Programs and Circuits. A nondeterministic
branching program is an acyclic digraph where all arcs are labelled by either a 
literal, i.e. a variable or a negated variable, or a boolean constant, and an initial 
and a terminal node. An input is accepted if and only if there is a path from 
the initial node to the terminal node in the graph that results from substituting 
constants for the literals according to the input and then deleting arcs labelled 
by 0. 
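
The acceptance condition just described amounts to a reachability test. The
following is a minimal Python sketch under stated assumptions: arcs are given
as triples (u, v, label), where a label is either the constant 0 or 1, or a
pair (i, negated) standing for the literal x_i or its negation. This encoding
and the function name are hypothetical.

```python
def accepts(arcs, initial, terminal, x):
    """Decide whether a nondeterministic branching program accepts input x.

    arcs: list of (u, v, label); label is 0, 1, or (i, negated).
    x:    list of input bits, x[i] being the value of variable i.
    """
    def value(label):
        if label in (0, 1):
            return label
        i, negated = label
        return 1 - x[i] if negated else x[i]

    # Substitute constants for literals and delete the arcs labelled by 0.
    succ = {}
    for u, v, label in arcs:
        if value(label) == 1:
            succ.setdefault(u, []).append(v)

    # Depth-first search for a path from the initial to the terminal node.
    stack, seen = [initial], {initial}
    while stack:
        u = stack.pop()
        if u == terminal:
            return True
        for v in succ.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


# Two arcs in series compute AND, two parallel arcs compute OR.
BP_AND = [("s", "m", (0, False)), ("m", "t", (1, False))]
BP_OR = [("s", "t", (0, False)), ("s", "t", (1, False))]
```

The two example programs illustrate that series composition of arcs behaves
like AND and parallel composition like OR.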

When referring to constant width branching programs we will mean a nonde- 
terministic branching program, which when viewed as a digraph, is layered and 
of constant bounded width. 

By a constant width circuit we mean a circuit consisting of fanin 2 AND and 
OR gates and fanin 1 COPY gates, which when viewed as a digraph, like for 
branching programs, is layered and of constant bounded width. Input nodes can 
be literals or boolean constants, and can occur anywhere in the circuit. 

Viewing nondeterministic branching programs and circuits as digraphs, it 
makes sense to restrict them geometrically. Since all cylindrical embeddings con- 
sidered are layered cylindrical, we will usually just write cylindrical instead of 
layered cylindrical. 

3 Embedding Digraphs on a Cylinder 

In this section we will prove Theorem 2. The “only if” part is easily obtained: 
Consider a layered cylindrical embedding of a layered digraph D. By adding new 
arcs one can eliminate sources and sinks appearing in all layers except the first 
and the last. Now add a new first layer with a single node with arcs to all nodes 
in the previous first layer. Similarly add a new last layer with a single node with 
arcs from all nodes in the previous last layer. We have thus obtained an acyclic 
layered cylindrical superdigraph of D with a unique source and sink. The proof 
is completed by observing that every cylindrical embedding can be transformed 
into a planar embedding. 

The “if” part is proved in several steps. First we will obtain a suitable parti- 
tion of the planar embedding, and then later use this to obtain a new embedding 
on a cylinder surface. 

Consider a planar embedding of a layered digraph D. Let V(D) = V_0 ∪ ⋯ ∪ V_h
be the layers of D. Let C be a closed curve in the plane, partitioning
the plane into two regions R_1 and R_2. We say that C is a separating curve for
layer i if the following two properties hold.

1. The intersection of C and the embedding of D consists exactly of the nodes
in V_i.

2. One of the regions R_j contains the embedding of the subdigraph induced by
V_0 ∪ ⋯ ∪ V_{i-1} as well as the arcs from V_{i-1} to V_i, and the other region
contains the embedding of the subdigraph induced by V_{i+1} ∪ ⋯ ∪ V_h as well
as the arcs from V_i to V_{i+1}.

The following proposition shows that we can find separating curves for all 
layers in an acyclic planar layered digraph with a unique source and sink. This 
is illustrated in Fig. 1. 
Fig. 1. Curves separating an acyclic planar layered digraph.



Proposition 5. Let D be an acyclic planar layered digraph with a unique source
and sink, with layers V(D) = V_0 ∪ ⋯ ∪ V_h, and fix a planar embedding of
D. Then there exist disjoint curves C_1, …, C_{h-1} such that C_i is a
separating curve for layer i with respect to this embedding for all i.

Proof. We will construct the separating curves by an iterative process. Assume
that we have found separating curves C_1, …, C_{i-1}.

Let v be a node in D which is neither the source nor the sink. That is, v
has at least one ingoing arc and one outgoing arc, and by Lemma 4 all the
ingoing arcs and all the outgoing arcs appear consecutively around v in the
embedding. It is then meaningful to talk about the rightmost ingoing (outgoing)
arc, and the leftmost ingoing (outgoing) arc.

Since the ingoing arcs to V_{i-1} and outgoing arcs from V_{i-1} are separated
by C_{i-1} it follows that, when traversing C_{i-1} from a node between the
leftmost ingoing and outgoing arcs, the next node is approached between the
rightmost ingoing and outgoing arcs.

We now find a separating curve C_i for layer i by the following process: Start
in an arbitrary node v in layer i. Follow along the left of the leftmost ingoing
arc a to a node w in layer i − 1. If a is the leftmost outgoing arc of w we
follow the curve C_{i-1} to the next node w' and follow the rightmost arc to a
node v' in layer i. Otherwise we follow the next outgoing arc (in the
counterclockwise order) to a node v' in layer i. Since both v and v' belong to
the boundary of the face of D we are within and because they belong to the same
layer of D, it follows from Lemma 4 that v' is approached along the rightmost
ingoing arc.




Fig. 2. Finding separating curves. 
We now continue the same process, as illustrated in Fig. 2, until a closed
curve C_i is found. It is clear that C_i has the following properties:

1. C_i only intersects D in nodes from V_i.

2. C_i is disjoint from C_1, …, C_{i-1}.

3. The region R partitioned by C_i containing C_{i-1} does not contain nodes
from layers i, …, h.

of which the last holds because D has a unique sink.

Now, since all nodes in V_i have an ingoing arc, and since they are not in the
region R containing C_{i-1} by (3), it follows from (1) that C_i intersects
exactly the nodes in V_i. From this and (3) it follows that C_i is a separating
curve for layer i.

The next step is to associate an orientation (clockwise or counterclockwise) 
to each of the separating curves. These will give the order in which the nodes of 
each layer is to be drawn around a circle of the cylinder. 

We give the first curve C_1 the counterclockwise orientation. Now assume we
have assigned orientations to C_1, …, C_{i-1}. We assign an orientation to C_i
based on whether it contains C_{i-1} or vice versa. There will be three cases
as illustrated in Fig. 3:

(a) C_{i-1} is inside the region bounded by C_i.

(b) The regions bounded by C_{i-1} and C_i are disjoint.

(c) C_i is inside the region bounded by C_{i-1}.







Fig. 3. Orienting the separating curves. 

In cases (a) and (c) we assign the same orientation to C_i as C_{i-1}. In case
(b) we assign the opposite orientation to C_i as C_{i-1}.

From the properties of separating curves it follows that case (b) can only
occur once. In fact there is a k such that if i < k, C_i is oriented
counterclockwise and if i ≥ k, C_i is oriented clockwise.

For nodes a and b on a curve C, define the segment from a to b to be
the segment of C traversed by following C from a to b (inclusive) along the
orientation associated to C.

The crucial property is now the following: Let a and b be nodes of layer i − 1
with arcs to c and d in layer i respectively. Then nodes (except a and b) on the
segment from a to b have only arcs to the segment from c to d (see Fig. 3). This
is in fact basically the characterization of cylindricality that was used in [8].

In fact, by this property, it follows that in Proposition 5, the separating curves 
found are actually traversed according to the orientations assigned above. 

Furthermore, assign an ordering of the outgoing (ingoing) arcs from (to) a 
layer by traversing the separating curve according to the associated orientation, 
and including the outgoing arcs (ingoing arcs) of each node in counterclockwise 
(clockwise) order. Then it follows from the above property that the ordering of 
the outgoing arcs from a layer coincide with the ordering of the next layer, where 
the arcs are ingoing. 

We are now in position to create a layered cylindrical embedding of D. The 
nodes in layer i are placed in a circle around the cylinder in the order they are 
met, when traversing the curve Ci according to the associated orientation, and 
the arcs between layers are simply drawn in the order described above. 

This completes the embedding and the proof of Theorem 2. 

4 Planar Branching Programs and Circuits 

In this section we characterize the computational power of planar circuits (and 
nondeterministic branching programs). First we show how to compute any 
ACC° function by a constant width polynomial size planar circuit. 

The core of the simulation is the following substitution lemma for planar 
circuits. Using this we only need to show how to compute AND, OR and MODm 
by planar circuits, to establish that planar circuits can compute all of ACC°. 

Lemma 6. If f(x_1, …, x_n) is computed by a planar circuit of size s_1 and
width w_1 and g_1, …, g_n are computed by planar circuits each of size s_2 and
width w_2, then f(g_1, …, g_n) is computed by a planar circuit of size
O(s_1 s_2^2) and width O(w_1 w_2).

Proof. By exchanging AND with OR gates and vice versa and by negating inputs,
the negations ¬g_1, …, ¬g_n are also computed by planar circuits of size s_2
and width w_2. Consider any planar embedding of a circuit C for f and choose
planar embeddings of circuits C_1, …, C_n for g_1, …, g_n and C̄_1, …, C̄_n for
¬g_1, …, ¬g_n with the output gate appearing on the outer face.

We now stretch each layer of C into s_2 layers, by replacing each arc by a
string of s_2 − 1 COPY gates, preserving planarity of the embedding. This
ensures that input nodes in different layers are at least s_2 layers apart, and
only increases the size by a factor O(s_2) and the width by a factor 2.

This now allows us to simply substitute the embedding of C_i for x_i and the
embedding of C̄_i for ¬x_i in the embedding of C, preserving planarity, without
increasing the width by more than a factor O(w_2).
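
The stretching step used in the proof of Lemma 6 can be sketched on the level
of the layered digraph alone. The following minimal Python sketch assumes a
simple layered digraph (no parallel arcs) given as a list of layers and a list
of arcs between consecutive layers; it subdivides every arc with COPY nodes and
ignores the embedding, so it only illustrates the growth in depth, size and
width. The function name and the node naming scheme are hypothetical.

```python
def stretch(layers, arcs, s):
    """Replace every arc by a path through s - 1 new COPY nodes.

    layers: list of lists of nodes; arcs go from layers[i] to layers[i + 1].
    Returns the new list of layers and the new list of arcs.
    """
    new_layers = []
    for i, layer in enumerate(layers):
        new_layers.append(list(layer))
        if i < len(layers) - 1:
            # s - 1 fresh intermediate layers between layer i and layer i + 1
            new_layers.extend([] for _ in range(s - 1))

    index = {node: i for i, layer in enumerate(layers) for node in layer}
    new_arcs = []
    for u, v in arcs:
        base = index[u] * s  # position of u's layer after stretching
        prev = u
        for step in range(1, s):
            copy = ("copy", u, v, step)
            new_layers[base + step].append(copy)
            new_arcs.append((prev, copy))
            prev = copy
        new_arcs.append((prev, v))
    return new_layers, new_arcs
```

Each intermediate layer holds one COPY node per original arc crossing it, so
for gates of fanin at most 2 its width is at most twice the width of the next
original layer, in line with the factor 2 stated in the proof.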

In [8] the construction for AND, OR and MODm is given in terms of cylindrical
branching programs, and hence also in terms of planar circuits. For
completeness we provide the details of the constructions directly in terms of
planar circuits here.

The AND (and OR) of n inputs is easily computed by a planar circuit of
width 2 and size O(n) as shown in Fig. 4.




Fig. 4. A planar circuit computing AND. 



The construction for MODm is more complicated; an example of it is shown
in Fig. 5.

It can be computed by a planar circuit of width O(m) and size O(mn)
constructed as follows: We will have 2n + 1 layers. The first layer consists of
a single input gate with the constant 1. In the last 2n layers the main part
consists of m sufficiently long strings, which we number 0, …, m − 1, of
alternating AND and OR gates taking each other as input, the AND gates in odd
layers and the OR gates in even layers.

String 0 starts with an AND gate and all other strings start with an OR
gate. The constant input 1 in layer 0 will take the place of an OR gate in
string 0. The construction will ensure that the OR gate in layer 2i of string j
evaluates to 1 if and only if x_1 + ⋯ + x_i ≡ j (mod m).

The first AND gate in string 0 takes the constant input 1 as input. The first
OR gate in the other strings takes a constant input 0 as input. The AND gate
in layer 2i + 1 of every string takes ¬x_i as the other input. The first m − 1
OR gates in string 0 take a constant input 0 as the other input. The other OR
gates in string 0, say in layer 2i, take an AND gate as the other input, which
is fed by x_i and the OR gate in layer 2(i − 1) of string m − 1. In general the
OR gate in layer 2i of string j takes an AND gate as the other input, which is
fed by x_i and the OR gate in layer 2(i − 1) of string j − 1. The output gate
is the last OR gate of string 0.
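
The logic of the strings can be simulated without drawing the circuit. The
following minimal Python sketch models only the values of the OR gates, not
the planar layout: state[j] stands for the OR gate of string j after a prefix
of the inputs has been read. It assumes that the AND gate inside each string is
gated by the negated literal, which is what the stated invariant requires; the
function names are hypothetical.

```python
from itertools import product


def mod_strings(m, xs):
    """Return the values of the m OR gates after reading the bits xs."""
    # Before any input is read only string 0 carries a 1 (the constant input).
    state = [1] + [0] * (m - 1)
    for x in xs:
        # String j keeps its value if x = 0 and takes over the value of
        # string j - 1 (cyclically) if x = 1.
        state = [
            (state[j] & (1 - x)) | (x & state[(j - 1) % m])
            for j in range(m)
        ]
    return state


def check_invariant(m, n):
    """Check that string j is 1 exactly when the input sum is j modulo m."""
    for xs in product((0, 1), repeat=n):
        expected = [1 if sum(xs) % m == j else 0 for j in range(m)]
        if mod_strings(m, xs) != expected:
            return False
    return True
```

Reading off state[0] at the end corresponds to taking the last OR gate of
string 0 as the output gate, which is 1 exactly when the number of ones among
the inputs is divisible by m.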

Note that while the above constructions are cylindrical, once we use Lemma
6 to substitute a circuit computing MODm on more than m inputs, we leave the
class of cylindrical circuits.

To conclude, we can from an ACC° circuit construct a constant width, 
polynomial size planar circuit computing the same function as follows: Expand 
the ACC° circuit into an ACC° formula, and move the NOT gates to the 
layer just above the inputs by the usual constructions. Now proceeding layer by 
layer, we build appropriate components for the AND, OR and MODm gates,
and compose these using the substitution lemma. 

We now prove the last part of Theorem 1, that constant width, polynomial 
size planar circuits only compute functions in ACC°. 

We need the following theorem from [8] . 

Theorem 7. Every boolean function computed by a constant width, polynomial 
size cylindrical circuit is in ACC°. 
Fig. 5. A planar circuit of width 9 computing MOD3 on 5 inputs.



Corollary 8. Let p be a polynomial and w any constant. Then there is a
polynomial q and a constant h such that every boolean function on n inputs
computed by a cylindrical circuit of size 3p(n) and width 2w is computed by an
ACC° circuit of size q(n) and depth h.

Now consider a planar circuit C on n inputs of size at most p(n) and width
at most w. Let q and h be the polynomial and the constant from Corollary 8.
We show that C is computed by an ACC° circuit of size q(p(n))^w and depth
wh, assuming without loss of generality that 1 + p(n) ≤ q(p(n)).

If the width of C is in fact 1, then the function computed by C is certainly
also computed by an ACC° circuit of size q(p(n)) and depth h.

Otherwise, pick a node v in the first layer of C and let D be the subdigraph 
of the digraph representation of C, which is induced by the nodes which are on 
a path from v to the output node of C. 

Removing the nodes in D, note that the remaining components are digraph
representations of planar circuits. There are at most p(n) of these, each of
size at most p(n), but with width at most w − 1, since at least one node is
removed from each layer. By induction, all these are computed by ACC° circuits
of size q(p(n))^{w-1} and depth (w − 1)h.

Now by Theorem 2, D is cylindrical, and we construct a cylindrical embedding
of it. For all AND and OR gates that have indegree 1 in D we will add a
new input variable in the previous layer. In order to preserve layered
cylindricality, we first stretch each layer into two layers using COPY gates.
Together this will yield a cylindrical circuit of size at most 3p(n) and width
at most 2w on at most p(n) inputs, which by Corollary 8 is computed by an ACC°
circuit of size q(p(n)) and depth h. Now substitute the functions constructed
inductively in this circuit to obtain an ACC° circuit of size
q(p(n)) + p(n) q(p(n))^{w-1} ≤ q(p(n))^w and depth wh computing the function
computed by C. This completes the proof of Theorem 1.
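
The size bound in the last step can be verified in one line. Writing
Q = q(p(n)) and P = p(n) as shorthand (these abbreviations are introduced here
only for the calculation), and using w ≥ 2 together with the assumption
1 + P ≤ Q, which also gives Q ≥ 1, a short derivation is:

```latex
\begin{align*}
Q + P\,Q^{w-1}
  &\le Q^{w-1} + P\,Q^{w-1} && \text{since } Q \ge 1 \text{ and } w \ge 2 \\
  &= (1 + P)\,Q^{w-1} \\
  &\le Q \cdot Q^{w-1} = Q^{w} && \text{since } 1 + P \le Q .
\end{align*}
```

The depth bound follows in the same way: the substituted circuits have depth
(w − 1)h and are placed below a circuit of depth h, giving depth wh in total.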

Turning to the computational power of planar nondeterministic branching 
programs, we can easily characterize this by Theorem 2. Indeed we can assume, 
without loss of generality that every node is on a path from the initial node to 
the terminal node, and hence Theorem 2 applies to give the following theorem. 

Theorem 9. Constant width, polynomial size planar branching programs com- 
pute the same boolean functions as constant width, polynomial size cylindrical 
branching programs. 

As noted in [8] this class of functions does not seem to capture all of ACC°. 
The compositional approach that worked so well for planar circuits using Lemma 
6 fails for planar branching programs: in order to substitute a branching program 
for an arc, preserving planarity, one would need an embedding of it with both 
the initial and the terminal node on the external face. By a theorem by Kelly
[9] and Di Battista and Tamassia [5] they would then need to be upwards planar
and thus by the results by Barrington et al. [2], substituted branching programs
can only contribute with AC° functions.

5 Uniformity Considerations 

When considering uniformity, we need to be more precise when we talk about 
embeddings. We will employ the concept of combinatorial planar embeddings 
based on Edmonds’ permutation technique [6] (see also [15] pages 70-73). 

Combinatorial embeddings are most conveniently introduced for undirected
graphs. Let $G = (V, E)$ be an undirected graph. For a vertex $u$ define the set
of neighbours $N(u) = \{v \mid \{u, v\} \in E\}$ and let $A = \{uv \mid \{u, v\} \in E\}$ be the set
of all oriented arcs obtained from $E$. Now let $p_u : N(u) \to N(u)$ be a cyclic
permutation of the neighbours of $u$. Define $P : A \to A$ by $P(uv) = v\,p_v(u)$.
Observe that $P$ is a permutation of $A$.

As Edmonds now observed, there is a one-to-one correspondence between
choices of $\{p_u\}$ and 2-cell embeddings of $G$ into closed orientable surfaces, where
faces of the embedding correspond to the orbits of $P$.

We can thus say that $\{p_u\}$ is a combinatorial planar embedding of $G$ if and
only if Euler's formula $v - e + f = c + 1$ is satisfied, where $v = |V|$, $e = |E|$, $f$ is
the number of orbits in $P$ and $c$ is the number of connected components of $G$.
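
The orbit construction can be tried out on a small example. The following is a minimal sketch, not part of the paper: it assumes the rotation system is given as a dictionary mapping each vertex to the cyclic list of its neighbours, and it only handles connected simple graphs (so $c = 1$ and Euler's formula reads $v - e + f = 2$). The names `face_orbits`, `is_planar_rotation`, `K4_PLANAR` and `K4_TWISTED` are illustrative.

```python
def face_orbits(rotation):
    """Return the orbits of the arc permutation P(uv) = v p_v(u).

    `rotation` maps each vertex to the cyclic list of its neighbours
    (an assumed input format for this sketch).
    """
    # succ[v][u] is p_v(u): the neighbour following u in the cyclic order at v.
    succ = {
        v: {nbrs[i]: nbrs[(i + 1) % len(nbrs)] for i in range(len(nbrs))}
        for v, nbrs in rotation.items()
    }
    arcs = [(u, v) for u, nbrs in rotation.items() for v in nbrs]
    seen, orbits = set(), []
    for start in arcs:
        if start in seen:
            continue
        orbit, arc = [], start
        # P is a permutation, so following it from `start` returns to `start`.
        while arc not in seen:
            seen.add(arc)
            orbit.append(arc)
            u, v = arc
            arc = (v, succ[v][u])
        orbits.append(orbit)
    return orbits


def is_planar_rotation(rotation):
    """Euler test v - e + f = 2, valid for a connected simple graph only."""
    v = len(rotation)
    e = sum(len(nbrs) for nbrs in rotation.values()) // 2
    f = len(face_orbits(rotation))
    return v - e + f == 2


# K4 drawn with vertex 3 inside the triangle 0, 1, 2 (clockwise rotations).
K4_PLANAR = {0: [1, 3, 2], 1: [2, 3, 0], 2: [0, 3, 1], 3: [0, 1, 2]}
# Same graph, but the rotation at vertex 3 is reversed.
K4_TWISTED = {0: [1, 3, 2], 1: [2, 3, 0], 2: [0, 3, 1], 3: [0, 2, 1]}
```

With `K4_PLANAR` the permutation has four orbits of length three, one per triangular face; reversing a single rotation leaves only two orbits, so the same graph is then embedded in a surface of higher genus.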




K. Arnsfelt Hansen 



A description of a combinatorial planar embedding of a digraph D then 
simply consists of having for each vertex a list of the neighbours, in clockwise 
order around the vertex, say, according to an embedding. A description of a 
layered cylindrical embedding consists of an ordering of the nodes in each layer, 
corresponding to their order around the cylinder in an embedding. 

We thus say that a family of planar circuits or cylindrical circuits is
logspace-uniform if there is an $O(\log n)$ space-bounded Turing machine which on
input $1^n$ outputs the description of the circuit on $n$ inputs as well as the above
defined description of the embedding.

With these definitions it is not difficult to realize Proposition 5 in logspace 
and using this also obtain Theorem 3. More details on this will be given in the 
final version of this paper. 

6 Conclusion 

We have obtained a characterization of $\mathrm{ACC}^0$ in terms of a geometrical
restriction of the digraph representation of circuits. Together with the previous
characterizations of $\mathrm{AC}^0$ this shows a striking similarity to the algebraic
characterizations by Barrington and Therien, as summarized in the following table.



Circuit Class                      | $\mathrm{AC}^0$ | $\mathrm{ACC}^0$ | $\mathrm{NC}^1$
Nonuniform Automata on Monoids     | Aperiodic       | Solvable         | Unrestricted
Constant Width Branching Programs  | Upwards planar  | ?                | Unrestricted
Constant Width Circuits            | Upwards planar  | Planar           | Unrestricted



It would be very interesting to further investigate the link between algebra and
geometry in this setting. Intuitively, a cylindrical circuit corresponds in some
sense to an $\mathrm{ACC}^0$ circuit with just one layer of MOD gates. One could hope to
explain this by tightening this link. In [8] a $\Pi_2 \circ \mathrm{MOD} \circ \mathrm{AC}^0$ lower bound and
an $\mathrm{ACC}^0$ upper bound were proved. Perhaps one could give a seemingly better
upper bound than $\mathrm{ACC}^0$, for example $\mathrm{AC}^0 \circ \mathrm{MOD} \circ \mathrm{AC}^0$.

We do not have a characterization of $\mathrm{ACC}^0$ in terms of geometric
restrictions of branching programs, yet they remain attractive for their simplicity.
It might very well be within reach to obtain lower bounds for constant width
planar branching programs, providing a first step for employing planarity in lower
bounds for $\mathrm{ACC}^0$.

Abstract. It is well known that the celebrated Lipton-Tarjan planar
separation theorem, in combination with a divide-and-conquer strategy,
leads to many complexity results for planar graph problems. For example,
by using this approach, many planar graph problems can be solved
in time $2^{O(\sqrt{n})}$, where $n$ is the number of vertices. However, the constants
hidden in the big-Oh are usually too large to claim the algorithms to be
practical even on graphs of moderate size. Here we introduce a new algorithm
design paradigm for solving problems on planar graphs. The paradigm is
so simple that it can be explained in any textbook on graph algorithms:
compute a tree or branch decomposition of a planar graph and do dynamic
programming. Surprisingly, such a simple approach provides faster
algorithms for many problems. For example, Independent Set on planar
graphs can be solved in time $O(2^{3.182\sqrt{n}} n + n^4)$ and Dominating Set in
time $O(2^{5.043\sqrt{n}} n + n^4)$. In addition, a significantly broader class of
problems can be attacked by this method. Thus with our approach, Longest
Cycle on planar graphs is solved in time $O(2^{2.29\sqrt{n}(\ln n + 0.94)} n^{5/4} + n^4)$
and Bisection is solved in time $O(2^{3.182\sqrt{n}} n + n^4)$. The proof of these
results is based on complicated combinatorial arguments that make strong
use of results derived by the Graph Minors Theory. In particular we
prove that the branch-width of a planar graph is at most $2.122\sqrt{n}$. In
addition we observe how a similar approach can be used for solving different
fixed parameter problems on planar graphs. We prove that our method
provides the best so far exponential speed-up for fundamental problems
on planar graphs like Vertex Cover, (Weighted) Dominating Set,
and many others.



1 Introduction 

The design of (exponential) algorithms that are significantly faster than
exhaustive search is one of the basic approaches to coping with NP-hardness [17].
Nice examples of fast exponential algorithms are Eppstein’s graph coloring algorithm
[16] and the algorithm for 3-SAT [10]. For a good overview of the field see the
recent survey written by Gerhard Woeginger [31].

It is well known that the approach of Lipton & Tarjan [25], based on the
celebrated planar separator theorem [24], yields algorithms with time complexity
$2^{O(\sqrt{n})}$ for many problems on planar graphs.

However, the constants “hidden” in $O(\sqrt{n})$ can be crucial for practical
implementations. During the last few years a lot of work has been done to compute
and to improve the “hidden” constants [3,4]. In this paper we observe a general
approach for obtaining sub-exponential time exact algorithms for many problems
on planar graphs. Our approach is based on dynamic programming for graphs
of bounded branch-width (tree-width). Combining our upper bound for the
branch-width of planar graphs with this simple approach, one can obtain an exponential
speed-up for many known algorithms for many different planar graph problems.
Independent Set, Dominating Set, SAT, MIN-Bisection, Longest Cycle
(Path) on planar graphs are just a few examples of such problems.

Another field of application of our graph theoretical bounds is the
design of parameterized algorithms. The last ten years have seen the
rapid development of a new branch of computational complexity: Parameterized
Complexity. (See the book of Downey & Fellows [15].) Roughly speaking, a
parameterized problem with parameter $k$ is fixed parameter tractable if it admits
a solving algorithm with running time $f(k)|I|^{\beta}$. (Here $f$ is a function depending
only on $k$, $|I|$ is the length of the non-parameterized part of the input and $\beta$ is a
constant.) Typically, $f(k) = c^k$ is an exponential function for some constant $c$.
However, it appears that for a large variety of planar graph problems algorithms
with growth of the form $f(k) = c^{\sqrt{k}}$ are possible. During the last two years much
attention was paid to the construction of algorithms with running time $O(c^{\sqrt{k}} n)$ for
different problems on planar graphs. The first paper on the subject was the
paper by Alber et al. [1] describing an algorithm with running time $O(4^{6\sqrt{34}\sqrt{k}} n)$
(which is approximately $O(2^{70\sqrt{k}} n)$) for the Planar Dominating Set problem.
Different fixed parameter algorithms for solving problems on planar and related
graphs are discussed in [4,23]. We observe that our technique can serve also as
a simple unified approach for solving many parameterized problems on planar
graphs in subexponential time. Again, our approach is based on combinatorial
bounds on planar branch-width and tree-width and provides a better running
time for such basic parameterized problems as Vertex Cover, Dominating
Set and many others.

The crucial part of our paper is devoted to the proof that such a simple
approach guarantees better time bounds, and here we use complicated
combinatorial arguments coming from Robertson-Seymour’s Graph Minor Theory. More
precisely, our proof is based on a new upper bound on the branch-width and
the tree-width of planar graphs. Both these parameters were introduced (and
served) as basic tools by Robertson and Seymour in their Graph Minors series
of papers. Tree-width and branch-width are related parameters (see Theorem 1)
and can be considered as measures of the “global connectivity” of a graph.
Moreover, they appear to be of major importance in algorithmic design, as many
NP-hard problems admit polynomial or even linear time solutions when their
inputs are restricted to graphs of bounded tree-width or branch-width. This
motivated the search for graphs where these parameters are relatively small. In this
direction, Alon, Seymour & Thomas proved in [5] that given a minor closed graph
class $\mathcal{G}$, any $n$-vertex graph $G$ in $\mathcal{G}$ has tree-width/branch-width $O(\sqrt{n})$. As a
consequence of this, any $n$-vertex planar graph $G$ has tree-width/branch-width
$\le 14.697\sqrt{n}$.

We show that every $n$-vertex planar graph $G$ has branch-width $\le 2.122\sqrt{n}$
and tree-width $\le 3.182\sqrt{n}$. To our knowledge, this is the best known upper
bound for the value of these parameters on planar graphs. To obtain the new
upper bounds we use deep “dual” and “min-max” theorems from the Graph Minors
series of papers of Robertson & Seymour.

1.1 Previous Results and Our Contribution 

Computation of constants $\alpha_t$ and $\alpha_b$ such that for every planar graph $G$ on $n$
vertices $\mathrm{tw}(G) \le \alpha_t\sqrt{n} + O(1)$ and $\mathrm{bw}(G) \le \alpha_b\sqrt{n} + O(1)$ is of great theoretical
importance. In [5] Alon, Seymour & Thomas proved that any $K_r$-minor free
graph on $n$ vertices has tree-width $\le r^{1.5}\sqrt{n}$. (Here $K_r$ is the complete graph on $r$
vertices.) Since no planar graph contains $K_6$ as a minor, we have that
$\alpha_t \le 6^{1.5} \le 14.697$.

The first objective of this paper is to reduce the constant $\alpha_b$ to 2.122 (for the
case of branch-width) and $\alpha_t$ to 3.182 (for the case of tree-width).

Lipton & Tarjan [25] were the first to observe the existence of $2^{O(\sqrt{n})}$ time
algorithms for several problems on planar graphs. However, the constants hidden
in the big-Oh of the exponent make these algorithms impractical. Later, a lot of work
was done on computing and reducing these constants. The best results known so
far can be found in [4], where generalizations and complicated improvements
of the Lipton-Tarjan approach (together with kernel reduction techniques) are used
to obtain subexponential parameterized algorithms.

Thus, for example, the approach suggested in [4] provides an 0(2® °’^ '/"n Inn) 
algorithm for Independent Set and an 0(2^® ®^'/"nlnn) algorithm for Dom- 
inating Set. 

Here we suggest a unified approach based on branch decompositions (see
Section 2 for the definitions). Our algorithm is simple and is performed in two
steps: first we compute a branch decomposition of a planar graph and then
we do dynamic programming on graphs of bounded branch-width. An optimal branch
decomposition of a planar graph can be constructed in polynomial time by using
the algorithm due to Seymour & Thomas (Sections 7 and 9 in [29]). (See also the
results of Hicks [21] on implementations of the Seymour & Thomas algorithm.) For
graphs with $n$ vertices this algorithm can be implemented in $O(n^4)$ steps. What
is important for practical applications, there are no large hidden constants
in the running time of this algorithm. As for the second stage, well known
dynamic programming algorithms on tree decompositions can easily be translated
to branch decompositions. Using upper bounds for branch-width we prove that
our approach provides more efficient solutions for many well known problems
on planar graphs.

The following table summarizes some known and new results on some
problems on planar graphs (for more problems see Section 3).

Problem                          | Known results           | New results
Planar Independent Set           | 0{2^-°’’^ n \ nn ) [4]  | $O(2^{3.182\sqrt{n}} n + n^4)$
Planar Dominating Set            | 0(2i®-®i^"nlnn) [4]     | $O(2^{5.043\sqrt{n}} n + n^4)$
Planar (k, r)-Center             |                         | $O((2r+1)^{3.182\sqrt{n}} n + n^4)$
Planar Longest Cycle             |                         | $O(2^{2.29\sqrt{n}(\ln n + 0.94)} n^{5/4} + n^4)$
Planar Longest Path              |                         | $O(2^{2.29\sqrt{n}(\ln n + 0.94)} n^{5/4} + n^4)$
Planar Bisection                 |                         | $O(2^{3.182\sqrt{n}} n + n^4)$
Planar Weighted Dominating Set   |                         | $O(2^{6.37\sqrt{n}} n + n^4)$
Planar Perfect Code              |                         | $O(2^{6.37\sqrt{n}} n + n^4)$
Planar Total Dominating Set      |                         | 0(2^'4^"n + M)
Planar h-Coloring                |                         | 0(2‘°s'*'2-12US/i„ 3/2 ^^4j
Planar Kernel                    |                         | $O(2^{5.043\sqrt{n}} n^2 + n^4)$
Planar H-Covering                |                         | 0(2'>-“^'*n + n4)



A similar approach works well also for parameterized problems. The next table
summarizes results on the most fundamental fixed parameter problems on planar
graphs. (See [3] for an overview of the results on this subject.) We include the
result from [18] because it is based on the main combinatorial result of this paper
and is obtained by a similar approach.

Problem                    | Known results    | New results
Planar k-Vertex Cover      | 0(24'^n) [3]     | $O(2^{4.5\sqrt{k}} k + k^4 + kn)$
Planar k-Dominating Set    | [23]             | $O(2^{15.13\sqrt{k}} k + k^4 + n^3)$ [18]
Planar k-Independent Set   | 0(24'^n) [3]     | 0 (fc 4 + -t n)



Thus our approach provides an exponential speed-up for the main basic
parameterized problems. Our method is quite universal and can be applied to
obtain an exponential speed-up for many known algorithms for different problems
with fixed parameters. We mention just a few: parameterized versions of
Independent Dominating Set, Perfect Dominating Set, Perfect Code,
Weighted Dominating Set, Total Dominating Set, Edge
Dominating Set, Face Cover, Vertex Feedback Set, Minimum Maximal
Matching, Clique Transversal Set, Disjoint Cycles, and Digraph
Kernel. Another advantage of our results is that they apply not only to planar
graphs but also to different generalizations of planar graphs, e.g. $K_{3,3}$-minor-free
or $K_5$-minor-free graphs.



2 Definitions 

All graphs in this paper are undirected, loop-less and, unless otherwise men- 
tioned, they may have multiple edges. 

Tree-width and branch-width. A tree decomposition of a graph $G$ is a pair
$(\{X_i \mid i \in V(T)\}, T)$, where $\{X_i \mid i \in V(T)\}$ is a collection of subsets of
$V(G)$ and $T$ is a tree, such that (1) $\bigcup_{i \in V(T)} X_i = V(G)$, (2) for each edge
$\{v, w\} \in E(G)$, there is an $i \in V(T)$ such that $v, w \in X_i$, and (3) for each
$v \in V(G)$ the set of nodes $\{i \mid v \in X_i\}$ forms a subtree of $T$.

The width of a tree decomposition $(\{X_i \mid i \in V(T)\}, T)$ equals
$\max_{i \in V(T)} (|X_i| - 1)$. The tree-width of a graph $G$, $\mathrm{tw}(G)$, is the minimum width
over all tree decompositions of $G$.
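
The three conditions and the width can be checked mechanically. The sketch below is illustrative only and is not taken from the paper: it assumes the bags are given as a dictionary from tree nodes to vertex sets, and it assumes (without verifying) that `tree_edges` describes a tree on those nodes. The function names are hypothetical.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check conditions (1)-(3) for bags {i: X_i} over a tree T.

    Assumption: `tree_edges` really form a tree on the keys of `bags`;
    this sketch does not verify that.
    """
    # (1) The bags together cover V(G).
    if set().union(*bags.values()) != set(vertices):
        return False
    # (2) Both endpoints of every edge of G share some bag.
    for v, w in edges:
        if not any(v in bag and w in bag for bag in bags.values()):
            return False
    # (3) The nodes whose bags contain v induce a connected subtree of T.
    adjacent = {i: set() for i in bags}
    for i, j in tree_edges:
        adjacent[i].add(j)
        adjacent[j].add(i)
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        reached, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in adjacent[i] & nodes:
                if j not in reached:
                    reached.add(j)
                    stack.append(j)
        if reached != nodes:
            return False
    return True


def decomposition_width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# A 4-cycle a-b-c-d with a valid two-bag decomposition of width 2.
CYCLE_VERTICES = ["a", "b", "c", "d"]
CYCLE_EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
GOOD_BAGS = {0: {"a", "b", "c"}, 1: {"a", "c", "d"}}
GOOD_TREE = [(0, 1)]
# One bag per edge along a path: vertex "a" then violates condition (3).
BAD_BAGS = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c", "d"}, 3: {"d", "a"}}
BAD_TREE = [(0, 1), (1, 2), (2, 3)]
```

In the second example every edge is covered, but the bags containing `a` sit at the two ends of the path, so they do not form a subtree.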

A branch decomposition of a graph (or a hyper-graph) $G$ is a pair $(T, \tau)$, where
$T$ is a tree with vertices of degree 1 or 3 and $\tau$ is a bijection from the set of leaves
of $T$ to $E(G)$. The order of an edge $e$ in $T$ is the number of vertices $v \in V(G)$
such that there are leaves $t_1, t_2$ in $T$ in different components of $T(V(T), E(T) - e)$
with $\tau(t_1)$ and $\tau(t_2)$ both containing $v$ as an endpoint.

The width of $(T, \tau)$ is the maximum order over all edges of $T$, and the
branch-width of $G$, $\mathrm{bw}(G)$, is the minimum width over all branch decompositions of $G$.
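
To make the notion of order concrete, here is a small sketch (not from the paper) that evaluates the width of a given branch decomposition. It assumes $T$ is supplied as a list of tree edges and $\tau$ as a dictionary from leaves to edges of $G$; it does not check the degree condition, and it computes the width of the decomposition handed to it, not the branch-width of the graph. The name `branch_decomposition_width` is illustrative.

```python
def branch_decomposition_width(tree_edges, tau):
    """Maximum order over all edges of T for the decomposition (T, tau).

    `tau` maps each leaf of T to an edge of G, given as a pair of endpoints.
    """
    adjacent = {}
    for a, b in tree_edges:
        adjacent.setdefault(a, set()).add(b)
        adjacent.setdefault(b, set()).add(a)

    def endpoints_on_side(root, blocked):
        # Vertices of G covered by leaves in the component of T - {root, blocked}
        # that contains `root`.
        seen, stack, covered = {root, blocked}, [root], set()
        while stack:
            t = stack.pop()
            if t in tau:
                covered.update(tau[t])
            for s in adjacent[t]:
                if s not in seen:
                    seen.add(s)
                    stack.append(s)
        return covered

    # The order of a tree edge is the number of vertices met on both sides.
    return max(
        len(endpoints_on_side(a, b) & endpoints_on_side(b, a))
        for a, b in tree_edges
    )


# Triangle a, b, c: the only tree shape is a star with three leaves.
TRIANGLE_TREE = [("x", 1), ("x", 2), ("x", 3)]
TRIANGLE_TAU = {1: ("a", "b"), 2: ("b", "c"), 3: ("c", "a")}

# 4-cycle a-b-c-d with two internal nodes x and y.
CYCLE_TREE = [("x", 1), ("x", 2), ("x", "y"), ("y", 3), ("y", 4)]
GOOD_TAU = {1: ("a", "b"), 2: ("b", "c"), 3: ("c", "d"), 4: ("d", "a")}
POOR_TAU = {1: ("a", "b"), 2: ("c", "d"), 3: ("b", "c"), 4: ("d", "a")}
```

Grouping adjacent edges of the cycle under the same internal node gives width 2, while pairing opposite edges forces all four vertices across the middle tree edge.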

It is easy to see that if $H$ is a subgraph of $G$ then $\mathrm{bw}(H) \le \mathrm{bw}(G)$. The
following result is due to Robertson & Seymour [(5.1) in [26]].

Theorem 1 ([26]). For any connected graph $G$ where $|E(G)| \ge 3$, $\mathrm{bw}(G) \le
\mathrm{tw}(G) + 1 \le \frac{3}{2}\mathrm{bw}(G)$.

From Theorem 1, any upper bound on tree-width implies an upper bound 
on branch-width and vice versa. 

Planar graphs: slopes and majorities. In this paper we use the expression
$\Sigma$-plane graph for any planar graph drawn in the sphere $\Sigma$. To simplify notation
we do not distinguish between a vertex of a $\Sigma$-plane graph and the point of
$\Sigma$ used in the drawing to represent the vertex, or between an edge and the
open line segment representing it. We also consider $G$ as the union of the points
corresponding to its vertices and edges. That way, a subgraph $H$ of $G$ can be seen
as a graph $H$ where $H \subseteq G$. We call a region of $G$ any connected component of
$\Sigma - E(G) - V(G)$. (Every region is an open set.) We use the notation $V(G)$, $E(G)$,
and $R(G)$ for the sets of the vertices, edges and regions of $G$. A path of $G$ is any
connected subgraph $P$ of $G$ with two vertices of degree 1 (we call them extremes)
and all other vertices (we call them internal) of degree 2. A sub-path of a path
$P$ is any path $P' \subseteq P$. A cycle of $G$ is any connected subgraph $C$ of $G$ with
all the vertices of degree 2. The length $|C|$ ($|P|$) of a cycle $C$ (path $P$) is the
number of its edges.

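If $\Delta \subseteq \Sigma$, then $\overline{\Delta}$ denotes the closure of $\Delta$, and the boundary of $\Delta$ is
$\mathrm{bd}(\Delta) = \overline{\Delta} \cap \overline{\Sigma - \Delta}$. An edge $e$ (a vertex $v$) is incident with a region $r$ if
$e \subseteq \mathrm{bd}(r)$ ($v \in \mathrm{bd}(r)$).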

We call a $\Sigma$-plane graph $G$ triangulated if all of its regions are triangles,
i.e. for every region $r$, $\mathrm{bd}(r)$ is a cycle of three edges and three vertices. Given
a region $r$ of a triangulated graph $G$ we call the cycle $\mathrm{bd}(r)$ a triangle of $G$. A
triangulation $H$ of a $\Sigma$-plane graph $G$ is any triangulated $\Sigma$-plane graph $H$
where $G \subseteq H$. Notice that any $\Sigma$-plane graph with all regions of size $\ge 3$ has a
triangulation. A triangle of a triangulated $\Sigma$-plane graph $G$ is a regional triangle
if it bounds a region of $G$.

Let $G$ be a $\Sigma$-plane graph. A subset of $\Sigma$ meeting the drawing only in vertices
of $G$ is called $G$-normal. A subset of $\Sigma$ homeomorphic to the closed interval $[0, 1]$




A Simple and Fast Approach for Solving Problems on Planar Graphs 



is called an $I$-arc. If the extreme points of a $G$-normal $I$-arc $L$ are both vertices of
$G$ then we call it a line of $G$. If a simple closed curve $F \subseteq \Sigma$ is $G$-normal then we
call it a noose.

The length of a line is the number of its vertices minus 1 and the length of a
noose is the number of its vertices. We denote by $|N|$ ($|L|$) the length of a noose
$N$ (line $L$). $\Delta \subseteq \Sigma$ is an open disc if it is homeomorphic to $\{(x, y) : x^2 + y^2 < 1\}$.
We say that a disc $D$ is bounded by a noose $N$ if $N = \mathrm{bd}(D)$. From the theorem
of Jordan, any noose $N$ bounds exactly two closed discs $\Delta_1, \Delta_2$ in $\Sigma$ where
$\Delta_1 \cap \Delta_2 = N$. We call a $\theta$-structure $S = (L_1, L_2, L_3)$ of $G$ the union of three
mutually touching lines. If for every $i, j$ with $1 \le i < j \le 3$ the noose $L_i \cup L_j$ has size
$\le k$ then we say that $S$ is a $\theta$-structure of length $\le k$. We call a $\theta$-structure
non-trivial if at least two of its lines have length $\ge 2$. We call the 6 closed discs
bounded by the nooses $L_i \cup L_j$, $1 \le i < j \le 3$, the closed discs bounded by $S$.

The radial graph of a $\Sigma$-plane graph $G$ is the bipartite $\Sigma$-plane graph $R_G$
obtained by selecting a point in every region $r$ of $G$ and connecting it to every
vertex of $G$ incident to that region. We call the vertices of $R_G$ that are not
vertices of $G$ radial vertices.
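
As an abstract illustration of this construction (ignoring the drawing itself), the sketch below builds the edge set of a radial graph from a list of regions. It is an assumption of this example that each region is described simply by the set of vertices incident to it, so a vertex that meets the same region several times contributes only one radial edge here. The names are hypothetical.

```python
def radial_graph_edges(regions):
    """Edges of the radial graph, given {region name: incident vertices}.

    Each edge joins a radial vertex (one per region) to an original vertex,
    so the result is bipartite by construction.
    """
    return {
        (("region", r), ("vertex", v))
        for r, incident in regions.items()
        for v in incident
    }


# The four triangular regions of a plane drawing of K4.
K4_REGIONS = {
    "r0": {0, 1, 2},
    "r1": {0, 1, 3},
    "r2": {0, 2, 3},
    "r3": {1, 2, 3},
}
K4_RADIAL = radial_graph_edges(K4_REGIONS)
```

For a triangulated graph every radial vertex has degree three, and a cycle of the radial graph alternates between radial vertices and vertices of $G$.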

Slopes and majorities are important tools for improving upper bounds.

Slopes (Robertson & Seymour [27]). Let $G$ be a $\Sigma$-plane graph and let $k \ge 1$
be an integer. A slope in $G$ of order $k/2$ is a function ins which assigns to every
cycle $C$ of $G$ of length $< k$ one of the two closed discs $\mathrm{ins}(C) \subseteq \Sigma$ bounded by
$C$ such that

[S1] If $C, C'$ are cycles of length $< k$ and $C \subseteq \mathrm{ins}(C')$ then $\mathrm{ins}(C) \subseteq \mathrm{ins}(C')$.

[S2] If $P_1, P_2, P_3$ are three paths of $G$ joining the same pair $u, v$ of distinct
vertices but otherwise disjoint, and the three cycles $P_1 \cup P_2$, $P_1 \cup P_3$, $P_2 \cup P_3$ all
have length $< k$ then $\mathrm{ins}(P_1 \cup P_2) \cup \mathrm{ins}(P_1 \cup P_3) \cup \mathrm{ins}(P_2 \cup P_3) \ne \Sigma$.

A slope is uniform if for every region $r \in R(G)$ there is a cycle $C$ of $G$ of
length $< k$ such that $r \subseteq \mathrm{ins}(C)$.

We need the following deep result proved in the Graph Minors papers by
Robertson & Seymour. This result follows from Theorems (6.1) and (6.5) in [27]
and Theorem (4.3) in [26]. (See also Theorems (6.2) and (7.1) in [29].)

Theorem 2 ([27]). Let $G$ be a connected $\Sigma$-plane graph where $|E(G)| \ge 2$ and
let $k \ge 1$ be an integer. The radial drawing $R_G$ has a uniform slope of order $\ge k$
if and only if $G$ has branch-width $\ge k$.

Majorities (Alon, Seymour & Thomas [6]). Let $G$ be a $\Sigma$-plane graph and let
$k \ge 0$ be an integer. A majority of order $k$ is a function big that assigns to every
noose $N$ of length $\le k$ a closed disc $\mathrm{big}(N) \subseteq \Sigma$ bounded by $N$ such that

[M1] If $P_1, P_2, P_3$ is a $\theta$-structure of $G$ with length $\le k$ and $P_3 \subseteq \mathrm{big}(P_1 \cup
P_2)$, then $\mathrm{big}(P_1 \cup P_3) \subseteq \mathrm{big}(P_1 \cup P_2)$ or $\mathrm{big}(P_2 \cup P_3) \subseteq \mathrm{big}(P_1 \cup P_2)$.

[M2] If $N$ is a noose of length $\le \min(2, k)$ then either $\mathrm{big}(N) - N$ contains
a vertex or $\mathrm{big}(N)$ includes at least two edges of $G$.

The following result gives an upper bound on the order of a majority
(statement (3.7) of [6]). This is a basic ingredient of our bound for the branch-width
of planar graphs.

Theorem 3 ([6]). Any majority of a $\Sigma$-plane graph $G$ has order $\le
\sqrt{4.5 \cdot |V(G)|} - 1$.

Our bounds on branch-width and tree-width follow from the following
theorem, which is the main combinatorial result of the paper.

Theorem 4. Let $G$, $|V(G)| \ge 5$, be a triangulated $\Sigma$-plane graph without
multiple edges, drawn in $\Sigma$ along with its radial graph, and let $k \ge 2$ be an integer.
If there exists a uniform slope of order $k + 1$ in $R_G$ then $G$ contains a majority
of order $k$.

The proof of Theorem 4 is rather long and technical. Due to space restrictions
we only sketch here the main ideas of the proof.

Sketch of the proof of Theorem 4. We want to associate nooses of $G$ with cycles
of $R_G$ and to translate the slope axioms into majority axioms. This correspondence
is not direct, as not every noose is a cycle of the radial graph.
To overcome this problem we need to work with “classes” of similar structures.

Let $G$ be a $\Sigma$-plane graph without loops or multiple edges and let $S \subseteq
\Sigma$ be an $I$-arc (simple closed curve) in $\Sigma$. We use the notation $\kappa_G(S) =
(v_1, \ldots, v_{|S \cap V(G)|})$ for the ordering (cyclic ordering) of the vertex set $S \cap V(G)$
that represents the way the vertices of $G$ are met by $S$. Notice that $\kappa$ can be
applied to both cycles and nooses but also to paths and lines. Especially for cycles
and paths of graphs without multiple edges, we can directly represent them with
the output of the function $\kappa$ (we will use the same notation for a cycle/path and
the (cyclic) ordering of the vertices that it meets).

Let $S$ be one of the following structures in $G$: a noose, a line, or a $\theta$-structure.
A variation of $S$ is the operation that transforms $S$ to a structure $S'$ of the same
type in such a way that $\mathrm{dif}(S, S') := (S \cup S') - (S \cap S')$ is a noose of size 2 where one
of the closed discs $D$ it bounds has the following two properties: (1) $D - \mathrm{bd}(D)$
contains no vertices of $G$, (2) $D$ contains at most one edge of $G$.

If two structures $S_1$ and $S_2$ are variations of each other, we denote this by
$S_1 \sim S_2$. If a structure $S'$ is the result of a finite number of consecutive variations
with $S$ as starting point, we call $S'$ a vibration of $S$. Notice that if $S'$ is a
vibration of $S$ then $V(G) \cap S = V(G) \cap S'$ and $S, S'$ have the
same length.

The importance of vibrations is that in a triangulated $\Sigma$-plane graph without
multiple edges every noose is a vibration of a cycle of the radial graph. This fact
is intuitively clear but needs a technical proof.

Let ins be a uniform slope of order $k + 1$ in $R_G$. To construct a majority we
need to define the function big. Every noose $N$ in $\Sigma$ of size $\le k$ is a vibration of
a cycle $C$ in $R_G$ and the length of $C$ is $\le 2k$. The cycle $C$ is also a noose in $\Sigma$ and
because $C$ and $N$ are vibrations of each other, they “separate” the same vertex
sets in $G$. In other words, if $\mathrm{ins}(C)$, $\overline{\Sigma - \mathrm{ins}(C)}$ are the closed discs bounded by $C$
then for one of the closed discs $D$ bounded by $N$, we have that $D \cap V(G) =
\overline{\Sigma - \mathrm{ins}(C)} \cap V(G)$. We define $\mathrm{big}(N) = D$.

The proof of the fact that the function big defined via ins satisfies the majority
axioms is quite technical. It uses some results about vibrations of $\theta$-structures
and requires a series of auxiliary results assuring that the basic topological
properties involved in the majority axioms are invariant under vibrations.

Theorem 4 implies our main combinatorial result.

Theorem 5. For any planar graph $G$, $\mathrm{bw}(G) \le \sqrt{4.5\,|V(G)|} \le 2.122\sqrt{|V(G)|}$.

Proof. We assume that $G$ has no multiple edges (notice that the duplication of
an edge does not increase the branch-width of a graph with branch-width $\ge 2$).
It is easy to see that $G$ has a triangulation $H$ without multiple edges. It is enough
to prove the bound of the theorem for $H$. By Theorem 3, $H$ does not have any
majority of order $> (3/\sqrt{2})\sqrt{|V(G)|}$. By Theorem 4, $R_H$ has no uniform slope of order
$> (3/\sqrt{2})\sqrt{|V(G)|} + 1$. The result now follows from Theorem 2. □

Since $9/(2\sqrt{2}) < 3.182$, Theorems 1 and 5 imply that for any planar graph
$G$, $\mathrm{tw}(G) \le 3.182\sqrt{|V(G)|}$. In the next section we examine the algorithmic
consequences of our combinatorial bounds.
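
For reference, the arithmetic behind the two constants can be written out as follows. This is only a restatement of Theorems 1 and 5 with $n = |V(G)|$, under the hypotheses of Theorem 1 (a connected graph with at least three edges); the decimal approximations are computed here and are not quoted from the paper.

```latex
\begin{align*}
\mathrm{bw}(G) &\le \sqrt{4.5\,n} = \frac{3}{\sqrt{2}}\sqrt{n}
   \approx 2.1213\sqrt{n} \le 2.122\sqrt{n},\\
\mathrm{tw}(G) &\le \frac{3}{2}\,\mathrm{bw}(G) - 1
   \le \frac{3}{2}\cdot\frac{3}{\sqrt{2}}\sqrt{n}
   = \frac{9}{2\sqrt{2}}\sqrt{n}
   \approx 3.1820\sqrt{n} \le 3.182\sqrt{n}.
\end{align*}
```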



3 Algorithmic Consequences 

In this section we discuss some applications of our results for different prob- 
lems on planar graphs. The following simple theorem is the source for obtaining 
subexponential algorithms for many graph problems. 

Theorem 6. Let $\Pi$ be an optimization problem that is solvable on graphs of
branch-width $\le \ell$ in time $f(\ell)g(n)$. Then on planar graphs problem $\Pi$ is solvable
in time $O(f(2.122\sqrt{n})\,g(n) + n^4)$.

Proof. First we compute an optimal branch decomposition of the planar graph,
using the algorithm due to Seymour & Thomas (Sections 7 and 9 in [29]). This
algorithm can be implemented in $O(n^4)$ steps. By Theorem 5 the width of this
decomposition is at most $2.122\sqrt{n}$, so solving the problem on it takes a further
$f(2.122\sqrt{n})\,g(n)$ steps. □



Corollary 1. Let $\Pi$ be an optimization problem that is solvable on graphs of
branch-width/tree-width $\le \ell$ in time $2^{O(\ell)}\,\mathrm{poly}(n, \ell)$. Then on planar graphs
problem $\Pi$ is solvable in subexponential time (in $2^{O(\sqrt{n})}$ steps).

In spite of its simplicity, Theorem 6 provides a general framework for
obtaining subexponential algorithms for a broad range of problems. The only thing
one needs to know to estimate the running time of the algorithm is how fast a
problem can be solved on graphs of bounded branch-width/tree-width¹. What is
really surprising is that such a trivial approach provides better time bounds
than many algorithms based on separator theorems that are complicated to analyze.

¹ Any algorithm solving a problem on graphs of tree-width $\le \ell$ in time $f(\ell)g(n)$ can
be translated to an algorithm for graphs of branch-width $\le \ell$ with running time
$O(f(\frac{3}{2}\ell)\,g(n) + m)$, where $m$ is the number of edges of the input graph.

Let us give just a few examples. It is well known that on graphs of tree-width $\ell$
Independent Set can be solved in time $O(2^{\ell} n)$ and hence on graphs of
branch-width $\le \ell$ it can be solved in time $O(2^{\frac{3}{2}\ell} n)$. Thus by Theorem 6 we obtain that
Independent Set on planar graphs is solvable in $O(2^{3.182\sqrt{n}} n + n^4)$. Dominating
Set on graphs of branch-width $\le \ell$ is solvable in time $O(2^{(3\log_4 3)\ell} m)$ [11]. Thus
on planar graphs, Dominating Set is solvable in $O(2^{5.043\sqrt{n}} n + n^4)$. Similar
arguments, based on the algorithms in [1], work for the planar versions of different
variations of the Dominating Set problem like Independent Dominating
Set, Perfect Dominating Set, Perfect Code, Weighted Dominating
Set, Red Blue Dominating Set where the time is $O(2^{6.37\sqrt{n}} n + n^4)$, and for
Total Dominating Set and Total Perfect Dominating Set where the
time is 0(2^ ‘*'/”n -I- n"*).

Longest Cycle and Longest Path problems on graphs of tree-width $\ell$ are
solved in $O(\ell!\,2^{\ell} n)$ time [7], implying an $O(2^{2.29\sqrt{n}(\ln n + 0.94)} n^{5/4} + n^4)$ algorithm
on planar graphs. MIN-Bisection is solvable in $O(2^{\ell} n)$ [22] on graphs of
tree-width $\ell$ and the planar version of the problem is solvable in $O(2^{3.182\sqrt{n}} n +
n^4)$. In [19], Gutin et al. gave a time $O(3^{\ell} k n)$ algorithm for finding a kernel
of size $k$ in a digraph whose underlying graph has tree-width at most $\ell$. This
implies that Kernel is solvable in $O(2^{5.043\sqrt{n}} n^2 + n^4)$. The h-coloring problem
is solvable in 0{h^^^£n) on graphs of tree-width $\ell$ [13], therefore its planar
version is solvable in time + n^). H-Cover is solvable in
time 0{n2^^^) [30] on graphs of tree-width $\le \ell$ and thus for planar graphs in time
0{2^'^‘^^'^^n+n‘^). Finally, (k, r)-Center is solvable in time $O((2r+1)^{\frac{3}{2}\ell} n)$ on
graphs of branch-width $\le \ell$ [11], providing an $O((2r+1)^{3.182\sqrt{n}} n + n^4)$ algorithm
for the planar version of the problem.
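
The exponent constants quoted above can be rechecked numerically. The short sketch below is an illustration, not part of the paper: it plugs the bounds $\mathrm{bw}(G) \le \sqrt{4.5\,n}$ and $\mathrm{tw}(G) \le \frac{3}{2}\sqrt{4.5\,n}$ into a running time of the form $b^{\ell}$ and reports the constant $c$ with $b^{\ell} = 2^{c\sqrt{n}}$. The function and dictionary names are hypothetical, and the Dominating Set entry assumes the branch-width exponent $3\log_4 3$ that the paper states for $c_3$ in its parameterized section.

```python
import math

ALPHA_BW = math.sqrt(4.5)   # Theorem 5: bw(G) <= sqrt(4.5 n)
ALPHA_TW = 1.5 * ALPHA_BW   # Theorems 1 and 5: tw(G) <= (3/2) sqrt(4.5 n)


def planar_exponent(base, alpha):
    """Constant c such that base**(alpha * sqrt(n)) == 2**(c * sqrt(n))."""
    return math.log2(base) * alpha


EXPONENTS = {
    # 2^l on graphs of tree-width l
    "Independent Set": planar_exponent(2, ALPHA_TW),
    "Bisection": planar_exponent(2, ALPHA_TW),
    # 3^l on graphs of tree-width l
    "Kernel": planar_exponent(3, ALPHA_TW),
    # 2^{(3 log_4 3) l} on graphs of branch-width l (assumed exponent)
    "Dominating Set": 3 * math.log(3, 4) * ALPHA_BW,
}

for problem, c in EXPONENTS.items():
    print(f"{problem}: 2^({c:.4f} * sqrt(n))")
```

The Kernel and Dominating Set constants coincide because $3\log_4 3 = \frac{3}{2}\log_2 3$, so both reduce to $\frac{3}{2}\log_2 3 \cdot \sqrt{4.5}$.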

More generally, almost every natural problem expressible in MSOL is solvable 
in time or 0{£\c^n^^^^), and by Corollary 1 is solvable in 

subexponential time on planar graphs. Examples of such problems where c is a
small constant are Vertex Feedback Set, Disjoint Cycles, Face Cover,
Edge Dominating Set, Clique Transversal, and Maximal Matching
(see [8,12]). For all these problems Corollary 1 provides subexponential
algorithms with small hidden constants.

Actually, one can further relax the conditions of Corollary 1, thereby
extending the framework where subexponential algorithms are possible. Indeed, it
is enough to have a time $(\mathrm{poly}(\ell, n))^{O(\ell)}$ algorithm for the problem $\Pi$ for graphs
of tree-width/branch-width at most $\ell$. Notice that such problems are not
necessarily expressible in MSOL. As an example we mention the problems of finding
a non-preemptive multicoloring with minimum sum/makespan. These problems
can be solved in time $O(n \cdot (\ell p \log n)^{\ell + 1})$ for graphs with tree-width $\le \ell$ (see [20]).
Therefore, they can be solved in time 0(pT^^/^logn•2^■^®■*°splog"loglogn^/n_|_J.J4^ 
on planar graphs. 

Similar ideas work for parameterized problems. Let $L$ be a parameterized
problem, i.e. $L$ consists of pairs $(I, k)$ where $k$ is the parameter of the problem.
Reduction to a linear problem kernel is the replacement of problem inputs $(I, k)$ by
a reduced problem with inputs $(I', k')$ (linear kernel) with constants $c_1, c_2$ such
that $k' \le c_1 k$, $|I'| \le c_2 k'$ and $(I, k) \in L \Leftrightarrow (I', k') \in L$. (We refer to Downey
& Fellows [15] for discussions on fixed parameter tractability and the ways of
constructing kernels.)

Theorem 7. Let $L$ be a parameterized problem with inputs $(I, k)$ (here $I$ can be a
graph, hypergraph or matroid) such that

- There is a linear problem kernel computable in time $T_{kernel}(|I|, k)$ with
constants $c_1, c_2$ and such that an optimal branch decomposition of the kernel is
computable in time $T_{bw}(|I'|)$.

- On graphs (hypergraphs, matroids) of branch-width $\le \ell$ and ground set of size
$n$ the problem $L$ can be solved in $O(2^{c_3 \ell} n)$, where $c_3$ is a constant.

- $\mathrm{bw}(I') \le c_4\sqrt{k}$, where $c_4$ is a constant.

Then $L$ can be solved in time $O(2^{c_3 c_4 \sqrt{k}} k + T_{bw}(|I'|) + T_{kernel}(|I|, k))$.

Proof. The algorithm works as follows. First we compute a linear kernel in
time $T_{kernel}(|I|, k)$. Then we construct a branch decomposition of the kernel in
$T_{bw}(|I'|)$ steps. The size of the kernel is at most $c_1 c_2 k = O(k)$. The branch-width
of the kernel is at most $c_4\sqrt{k}$, so the problem is solved on the kernel in
$O(2^{c_3 c_4 \sqrt{k}} k)$ steps, and the whole algorithm takes
$O(2^{c_3 c_4 \sqrt{k}} k + T_{bw}(|I'|) + T_{kernel}(|I|, k))$ time. □

Let us give some examples where Theorem 7 provides provably better bounds
for different parameterized problems.

The Planar k-Vertex Cover problem is the task to compute, given a planar
graph $G$ and a positive integer $k$, a vertex cover of size $k$ or to report that no
such set exists. A linear problem kernel of size $2k$ (with constants $c_1 = 1$ and
$c_2 = 2$) for the k-Vertex Cover problem (not necessarily planar) was obtained
by Chen et al. [9]. The running time of the algorithm constructing a kernel of a
graph on $n$ vertices is $O(kn + k^3)$. So in this case $T_{kernel}(|I|, k) = O(kn + k^3)$.
It is well known that the Vertex Cover problem on graphs on $n$ vertices and
with tree-width $\le \ell$ can be solved in $O(2^{\ell} n)$ time. The dynamic
programming algorithm for Vertex Cover on graphs of bounded tree-width
can easily be translated to a dynamic programming algorithm for graphs of
bounded branch-width with running time $O(2^{\frac{3}{2}\ell} m)$, where $m$ is the number of
edges in the graph; we omit it here. For planar graphs $O(2^{\frac{3}{2}\ell} m) = O(2^{\frac{3}{2}\ell} n)$,
thus $c_3 \le 3/2$.

From the constructions used in the reduction algorithm of Chen et al. [9]
it follows that if $G$ is a planar graph then the kernel graph is also planar. To
compute an optimal branch decomposition of a planar graph one can use the
algorithm due to Seymour & Thomas [29]. This algorithm (applied to the kernel
graph) can be implemented in $O(k^4)$ steps. The kernel graph $I'$ has at most
$2k$ vertices. Then by Theorem 5, $c_4 \le \sqrt{4.5}\cdot\sqrt{2} = 3$. Thus by making use of
Theorem 7, we conclude that Planar k-Vertex Cover can be solved in
$O(k^4 + 2^{4.5\sqrt{k}} k + kn)$.

A k-dominating set D of a graph G is a set of k vertices such that every
vertex outside D is adjacent to a vertex of D. The Planar k-Dominating Set
problem is the task to compute, given a planar graph G and a positive integer
k, a k-dominating set or to report that no such set exists.

Alber, Fellows & Niedermeier [2] show that the Planar Dominating Set
problem admits a linear problem kernel. (The size of the kernel is 335k.) This
reduction can be performed in $O(n^3)$ time. The Dominating Set problem on graphs
of branch-width $\le \ell$ can be solved in $O(3^{1.5\ell} m)$ steps [18]. Thus $c_3 \le 3\log_4 3$.

It is proved in [18] that for every planar graph G with a dominating set of size k, the
branch-width of G is at most $3\sqrt{4.5}\sqrt{k}$, i.e. $c_4 \le 3\sqrt{4.5}$. Then by Theorem 7,
Planar Dominating Set can be solved in $O(2^{15.13\sqrt{k}} k + n^3 + k^4)$.
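
The constant in the exponent of Theorem 7 is simply the product $c_3 c_4$. The following minimal sketch recomputes that product for the two examples above; the function name `theorem7_exponent` is ours and purely illustrative, and the constants are the ones quoted in the text ($c_3 \le 3/2$, $c_4 \le 3$ for vertex cover; $c_3 \le 3\log_4 3$, $c_4 \le 3\sqrt{4.5}$ for dominating set).

```python
import math


def theorem7_exponent(c3: float, c4: float) -> float:
    """Constant c in the factor 2^(c * sqrt(k)) of Theorem 7, i.e. c3 * c4."""
    return c3 * c4


# Planar k-Vertex Cover: c3 <= 3/2 and c4 <= sqrt(4.5) * sqrt(2) = 3.
vertex_cover = theorem7_exponent(1.5, math.sqrt(4.5) * math.sqrt(2))

# Planar k-Dominating Set: c3 <= 3 * log_4(3) and c4 <= 3 * sqrt(4.5).
dominating_set = theorem7_exponent(3 * math.log(3, 4), 3 * math.sqrt(4.5))

print(f"vertex cover:   2^({vertex_cover:.2f} * sqrt(k))")
print(f"dominating set: 2^({dominating_set:.2f} * sqrt(k))")
```

The sketch only reproduces the arithmetic; the polynomial terms of the bounds come from the kernelization and branch decomposition steps and are not modelled here.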

Other problems and generalizations. Our ideas can be adapted to different 
problems by using the bounds and tree- width (branch-width) based algorithms in 
the same fashion as it is done in [1,3,8,12] . That way, our upper bound implies the 
construction of faster algorithms for a series of problems when their inputs are 
restricted to planar graphs. As a sample we mention parameterized versions of 
the following problems: Independent Dominating Set, Perfect Dominat- 
ing Set, Perfect Code, Weighted Dominating Set, Total Dominating 
Set, Edge Dominating Set, Face Cover, Vertex Feedback Set, Min- 
imum Maximal Matching, Clique Transversal Set, Disjoint Cycles, 
and Digraph Kernel (see [1,3,8,12] for the exact definitions). 

Finally let us note that our upper bound for treewidth holds not only on
planar graphs but also on different generalizations of planar graphs. This follows
directly from the results of [12] and implies an exponential speed-up for all the
aforementioned problems on certain classes of non-planar graphs such as
$K_{3,3}$-minor-free or $K_5$-minor-free graphs.

Abstract. The question whether the preemptive Sum Multicoloring
(pSMC) problem is hard on paths was raised by Halldórsson et al. in
[8]. The pSMC problem is a scheduling problem where the pairwise
conflicting jobs are represented by a conflict graph, and the time lengths of
jobs by integer weights on the nodes. The goal is to schedule the jobs so
that the sum of their finishing times is minimized. In the paper we give
an $O(n^3 p)$ time algorithm for the pSMC problem on paths, where n is
the number of nodes and p is the largest time length. The result easily
carries over to cycles.



1 Introduction 

In scheduling problems there is a set of jobs, each with a given time length.
Jobs may conflict, in which case they cannot be worked on at the
same time, e.g. because they use some non-sharable resource. Real-life situations
of this kind can be found in operating systems and in areas like traffic intersection
control, frequency assignment for mobile phones, VLSI routing, etc. (see [7]).

In the mathematical model the jobs are represented as nodes of a simple
undirected graph $G = (V,E)$, where two nodes representing conflicting jobs
are connected by an edge. The demand of $v \in V$ is a positive integer $x(v)$
modeling the number of time units needed to finish the job of v. A proper
schedule $\Psi : V \to 2^{\mathbb{N}}$ of the jobs assigns a set $\Psi(v)$ of positive integers to
each $v \in V$ s.t. $|\Psi(v)| = x(v)$ and the sets assigned to adjacent vertices do
not intersect (i.e., they are never scheduled at the same time). In this way the
scheduling problem becomes a graph coloring problem if $x(v) = 1$ for each
$v \in V$, and a graph multicoloring problem in the general case. (The name stems
from regarding the $\Psi(v)$ as sets of colors. In the paper, however, we continue to
view the problem as a scheduling problem since we will use colors for something
else.)

A traditional optimization goal is to minimize the overall finishing time, or,
respectively, the number of colors used to color all the vertices. Another reasonable
goal is to minimize the average finishing time of the jobs. That is, if $f(v)$ denotes
the largest integer assigned to v, we search for a schedule (multicoloring) such
that $\sum_{v \in V} f(v)$ is minimum over all proper schedules. The latter is called the
sum multicoloring (SMC) problem.

In the paper we consider preemptive scheduling, where the $\Psi(v)$ are arbitrary
sets of positive integers (pSMC problem). There has been much related work
done concerning the non-preemptive SMC (npSMC) problem, where the assigned
$\Psi(v)$ sets must be contiguous, see e.g. [7,8].
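
To make the model concrete, here is a minimal sketch of the objects defined above: a schedule is represented as a dictionary from nodes to sets of positive integers. The function names (`is_proper_schedule`, `finishing_time`, `sum_of_finishing_times`, `is_contiguous`) and the tiny three-node example are illustrative assumptions, not part of the paper.

```python
def is_proper_schedule(edges, demand, schedule):
    """Check |schedule[v]| = demand[v] and disjointness along every edge."""
    for v, x in demand.items():
        levels = schedule[v]
        if len(levels) != x or any(t < 1 for t in levels):
            return False
    return all(schedule[u].isdisjoint(schedule[v]) for u, v in edges)


def finishing_time(schedule, v):
    """Largest time unit assigned to v (0 for a job with no demand)."""
    return max(schedule[v], default=0)


def sum_of_finishing_times(schedule):
    """The SMC objective: sum of finishing times over all nodes."""
    return sum(finishing_time(schedule, v) for v in schedule)


def is_contiguous(levels):
    """True if the time units form one interval (the npSMC restriction)."""
    return not levels or max(levels) - min(levels) + 1 == len(levels)


# Hypothetical path 1 - 2 - 3 with demands 1, 2, 1.
edges = [(1, 2), (2, 3)]
demand = {1: 1, 2: 2, 3: 1}
schedule = {1: {1}, 2: {2, 3}, 3: {1}}
print(is_proper_schedule(edges, demand, schedule), sum_of_finishing_times(schedule))
```

A preemptive schedule may assign any set of the right size; a non-preemptive one additionally has to pass `is_contiguous` on every node.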

Our result. The question whether the pSMC problem is hard on paths was raised as
an open problem by Halldórsson et al. in [8]. In this paper we provide a
pseudo-polynomial algorithm for the problem. Let $G = (V,E)$ be a path, $|V| = n$, and
$p = \max_{v \in V} x(v)$. Our algorithm takes $O(n^3 p)$ time. It is based on a technique
that is interesting in its own right. With minor modifications the approach can
be applied to the pSMC problem on cycles.

Related work. Here we just mention the most relevant results. For a more com- 
prehensive history of the SMC and related problems see, e.g. [2,8]. 

The sum coloring problem, the special case of SMC with unit time require- 
ments, was first raised by Kubicka in [4], where a polynomial algorithm was 
given for the case of trees. The sum coloring problem is NP-hard even on bipar- 
tite graphs [1], interval graphs [9], planar graphs [2], and line graphs [6]. These 
results imply the hardness of the corresponding SMC problems. 

The general SMC problem was introduced by Bar-Noy et al. [7] within a 
comprehensive study on the approximability of both the pSMC and npSMC on 
different graph classes. 

In [8] two efficient algorithms are provided for the non-preemptive (npSMC)
problem on trees. They run in $O(n^2)$ and in $O(np)$ time, respectively. On paths
the first one runs in $O(n \log p / \log\log p)$ time. For the preemptive (pSMC)
problem on trees, a polynomial time approximation scheme is given.

Marx proved the hardness of the pSMC problem on trees in [5]. He has 
shown that pSMC is NP-hard even on binary trees, even when p is polynomially 
bounded. Thus, the SMC problem on trees turned out to be one of the few 
scheduling-type of problems in which the preemptive version is essentially harder 
than the non-preemptive version. It is natural to go on asking, on which graph- 
classes pSMC is efficiently solvable. In [8] the question is raised, whether pSMC 
is hard on paths. For this problem, an algorithm polynomial in n and p is given in 
this paper. It can serve as a first step towards characterizing these graph-classes. 

Overview. Section 2 describes the basic notation we use and establishes a few
elementary facts. In Section 3 we give the ingredients of a pseudo-polynomial
algorithm. Section 4 contains the details of an improved algorithm of $O(n^4 p)$
steps. Unfortunately we could not get rid of the factor p, but we conclude by
sketching a further improvement in the exponent of n and the modifications for
the case when the graph is not a path but a cycle.

2 Notation, Definitions, and Basic Facts

The nodes in the path are numbered from left to right by 1, ..., n. If $i < j$, we
denote the subpath of starting node i and ending node j by $[i,j]$. If i is a node,
then $x(i) \in \mathbb{N}^+$ is the demand of i. Let $p = \max_{1 \le i \le n} x(i)$ be the largest demand.

$\Psi(i) \subset \mathbb{N}$ is the set of numbers or time units assigned to i in schedule $\Psi$. $f_\Psi(i)$
denotes the finishing time (the largest number assigned to i). We simply write
$f(i)$ when $\Psi$ is clear from the context. We also use $f(i,j) := \max(f(i), f(j))$.

We add nodes 0 and $n+1$ to the path with demands $x(0) = x(n+1) = 0$.

Definition 1. We call $\Psi$ a (proper) schedule, if $|\Psi(i)| = x(i)$ and
$\Psi(i) \cap \Psi(i+1) = \emptyset$ $(1 \le i \le n-1)$. $\Psi$ is an optimal schedule, if
$\sum_{i=1}^{n} f(i)$ is minimum over all schedules. $\Psi$ is a square-optimal schedule,
if it is optimal, and the sum $\sum_{i=1}^{n} f^2(i)$ is maximum over all optimal schedules.

Intuitively, in a square-optimal schedule small $f(i)$ values are as small as
possible and large $f(i)$ are as large as possible.

We will give a pseudo-polynomial algorithm (polynomial in n and p) that
computes an optimal schedule for given demands on the nodes of a path.

Definition 2. Given a schedule $\Psi$, we say that node i is

a local minimum, if $i = 0$, or $i = n+1$, or $f(i-1) > f(i) < f(i+1)$;
a local maximum, if $f(i-1) < f(i) > f(i+1)$;
a stair otherwise, in particular a stair-up, if $f(i-1) < f(i) < f(i+1)$, and
a stair-down, if $f(i-1) > f(i) > f(i+1)$;
compact, if $f(i) = x(i)$.
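
The classification of Definition 2 depends only on the finishing times, so it can be sketched in a few lines. In the example below the demands are those listed under Fig. 1, but the finishing times `f` are hypothetical values that we chose to be consistent with the facts stated in the caption of Fig. 1 (the figure itself gives the actual schedule); the function names are ours.

```python
def classify_nodes(f):
    """Classify nodes 0..n+1 of a path by their finishing times f[0..n+1].

    f[0] and f[n+1] belong to the two added sentinel nodes (both 0).
    """
    last = len(f) - 1
    kinds = {}
    for i in range(len(f)):
        if i == 0 or i == last or f[i - 1] > f[i] < f[i + 1]:
            kinds[i] = "local minimum"
        elif f[i - 1] < f[i] > f[i + 1]:
            kinds[i] = "local maximum"
        elif f[i - 1] < f[i] < f[i + 1]:
            kinds[i] = "stair-up"
        else:
            # Neighbours with positive demand never share a finishing time,
            # so for proper schedules this branch means f(i-1) > f(i) > f(i+1).
            kinds[i] = "stair-down"
    return kinds


def compact_nodes(f, x):
    """Nodes whose finishing time equals their demand."""
    return [i for i in range(len(f)) if f[i] == x[i]]


# Demands from Fig. 1; the finishing times are an assumption (see lead-in).
x = [0, 1, 2, 5, 10, 50, 99, 4, 99, 50, 12, 50, 10, 5, 2, 1, 0]
f = [0, 1, 3, 7, 15, 60, 149, 5, 104, 62, 12, 62, 15, 7, 3, 1, 0]
kinds = classify_nodes(f)
print([i for i, kind in kinds.items() if kind == "local minimum"])
```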

Let us use the following visualizing expressions. We say that i is black on
level a, if $a \in \Psi(i)$, and i is white on level a if $a \notin \Psi(i)$ (see Fig. 1). For example,
if i is compact then it is not white on any level under $f(i)$.

For $i < j$ we will also say that the ordered pair $(i,j)$ is black-white,
black-black, etc. on level a. Note that $(i,i+1)$ cannot be black-black on any level.
The following is easy to see:

Proposition 1. In an optimal schedule the number of levels where $(i,j)$ is
black-black is at most p. The same holds for white-black and for black-white levels. The
number of levels under $f(i,j)$ where $(i,j)$ is white-white is at most 2p.



Definition 3. An $(i,j)$ pair is conflicting on level a, if either
$i \equiv j \pmod 2$ and $(i,j)$ is black-white or white-black, or
$i \not\equiv j \pmod 2$ and $(i,j)$ is black-black or white-white.
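
Definition 3 says that a pair is conflicting when its colours break the strict black/white alternation that a path would show between i and j. Since neighbours are never both black on a level, a broken alternation can only come from two neighbouring white nodes; the sketch below checks both notions on a small schedule. The helper names and the example schedule are illustrative assumptions.

```python
def is_black(schedule, i, a):
    """Node i is black on level a if a belongs to its assigned set."""
    return a in schedule[i]


def is_conflicting(schedule, i, j, a):
    """Definition 3: same parity with different colours, or different parity with equal colours."""
    same_colour = is_black(schedule, i, a) == is_black(schedule, j, a)
    same_parity = (i - j) % 2 == 0
    return same_parity != same_colour


def white_white_edge(schedule, i, j, a):
    """Return some k in [i, j-1] with (k, k+1) white-white on level a, or None."""
    for k in range(i, j):
        if not is_black(schedule, k, a) and not is_black(schedule, k + 1, a):
            return k
    return None


# Hypothetical proper schedule on the path 1 - 2 - 3 - 4 with demands 1, 2, 5, 10.
schedule = {
    1: {1},
    2: {2, 3},
    3: {1, 4, 5, 6, 7},
    4: {2, 3} | set(range(8, 16)),
}
print(is_conflicting(schedule, 1, 3, 4), white_white_edge(schedule, 1, 3, 4))
```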



Proposition 2. If $(i,j)$ is conflicting on level a, then $\exists k \in [i,j-1]$ such that
$(k,k+1)$ is white-white on level a.

Definition 4. Suppose that $\forall k \in [i,j]$, $f(k) \ge \max(a,b)$ in a schedule $\Psi$. We
say that we change the levels a and b on $[i,j]$, if $\forall k \in [i,j]$ we make k white
(black) on level a if and only if according to $\Psi$ it was white (black) on level b,
and we make k white (black) on level b, if and only if it was white (black) on
level a.

After carrying out this operation, we may have to make corrections to get a
proper schedule again. Note that we will have to check the pairs $(i-1,i)$ and
$(j,j+1)$ on the levels a and b.
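
The level-changing operation of Definition 4 is used in most of the proofs that follow, so a small sketch may help. It exchanges two levels on a subpath and then reports the boundary pairs that need correction; the names `change_levels` and `boundary_clashes` and the three-node example are our own assumptions.

```python
def change_levels(schedule, i, j, a, b):
    """Return a copy of `schedule` with levels a and b exchanged on nodes i..j."""
    changed = {v: set(levels) for v, levels in schedule.items()}
    for k in range(i, j + 1):
        # Definition 4 assumes every node of [i, j] finishes at or above both levels.
        if max(schedule[k], default=0) < max(a, b):
            raise ValueError(f"node {k} finishes below level {max(a, b)}")
        on_a, on_b = a in schedule[k], b in schedule[k]
        changed[k] -= {a, b}
        if on_b:
            changed[k].add(a)
        if on_a:
            changed[k].add(b)
    return changed


def boundary_clashes(schedule, i, j, a, b):
    """Pairs (i-1, i) and (j, j+1) that are black-black on level a or b."""
    clashes = []
    for u, v in ((i - 1, i), (j, j + 1)):
        if u in schedule and v in schedule:
            for level in (a, b):
                if level in schedule[u] and level in schedule[v]:
                    clashes.append((u, v, level))
    return clashes


# Hypothetical schedule on the path 1 - 2 - 3.
schedule = {1: {1}, 2: {2, 3}, 3: {1, 4}}
changed = change_levels(schedule, 2, 3, 1, 3)
print(changed, boundary_clashes(changed, 2, 3, 1, 3))
```

Inside the subpath the schedule stays proper, because both levels are exchanged on every node; only the two boundary pairs can become black-black, which is exactly what the note above asks to check.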

Proposition 3. If i is a stair-up in a square-optimal schedule $\Psi$, then $(i-1,i)$
is either white-black or black-white on any level $a < f(i)$. A symmetric statement
holds if i is a stair-down.

Proof. We need to show that $(i-1,i)$ is not white-white on any level below
$f(i)$. Suppose $(i-1,i)$ is white-white on level $a < f(i)$. Let M be the first local
maximum to the right of i, and let's change the levels a and $f(i)$ on $[i,M]$.
Now we decreased $f(i)$ by at least one, and got a proper schedule on $[i-1,i]$,
since $i-1$ is white on the level a. If it is a proper schedule on $[M,M+1]$, then
we decreased the optimum sum, a contradiction; if it is not a proper schedule on
$[M,M+1]$, then we make M white either on the level a or on the level $f(i)$, and
make it black on the level $f(M)+1$ (increase $f(M)$ by one). Now we created
another optimal schedule, but increased the sum of squares of finishing times,
again a contradiction since $\Psi$ was square-optimal. □

Corollary 1. In a square-optimal schedule let i be a local minimum and M
the first local maximum to the right of i, and let $i < k < M$. Then $f(k) =
x(k-1) + x(k)$ holds, and $\Psi(k)$ is determined above the level $f(i)$. Moreover,
$\Psi(i)$ determines the whole $\Psi(k)$. A symmetric statement holds for stair-downs.

Proposition 4 is also based on a simple level-changing argument (see [3]):

Proposition 4. For any local minimum k in an optimal schedule, $f(k) \le 3x(k)$.



3 An Outline of the Algorithm and Further Definitions 

Definition 5. With respect to a given schedule $\Psi$, let $i < j$ be both local minima
with the property that if $i < k < j$ is a local minimum between them, then
$f(k) \ge \max(f(i), f(j))$. We will say that such an $(i,j)$ pair is a convex pair.

Definition 6. For a convex pair $(i,j)$, in a schedule $\Psi$, let $pit(i,j) = k$ if
$i < k < j$ and k is a local minimum, and

if $i < k' < j$ and $k'$ is a local minimum, then $f(k) < f(k')$ or ($f(k) = f(k')$
and $k \le k'$).

If there is no local minimum between i and j, then let $pit(i,j) = 0$ and let
$top(i,j)$ denote the unique local maximum between i and j.
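
Definitions 5 and 6 can be read directly off the vector of finishing times. The sketch below enumerates convex pairs and computes pit and top; it treats a minimum between i and j with finishing time equal to max(f(i), f(j)) as admissible, which is our reading of the definition. The finishing times `f` are hypothetical values chosen to agree with the facts listed in the caption of Fig. 1, and all function names are ours.

```python
def local_minima(f):
    """Local minima of Definition 2, including the sentinels 0 and n+1."""
    last = len(f) - 1
    return [i for i in range(len(f))
            if i in (0, last) or f[i - 1] > f[i] < f[i + 1]]


def is_convex_pair(f, i, j):
    """Definition 5: minima i < j with no lower local minimum strictly between them."""
    minima = set(local_minima(f))
    if not (i < j and i in minima and j in minima):
        return False
    bound = max(f[i], f[j])
    return all(f[k] >= bound for k in range(i + 1, j) if k in minima)


def pit(f, i, j):
    """Leftmost lowest local minimum strictly between i and j, or 0 if none."""
    minima = set(local_minima(f))
    between = [k for k in range(i + 1, j) if k in minima]
    return min(between, key=lambda k: (f[k], k)) if between else 0


def top(f, i, j):
    """The unique local maximum between i and j (only used when pit(i, j) = 0)."""
    maxima = [k for k in range(i + 1, j) if f[k - 1] < f[k] > f[k + 1]]
    if len(maxima) != 1:
        raise ValueError("top(i, j) is only defined when pit(i, j) = 0")
    return maxima[0]


# Hypothetical finishing times for the instance of Fig. 1 (an assumption).
f = [0, 1, 3, 7, 15, 60, 149, 5, 104, 62, 12, 62, 15, 7, 3, 1, 0]
minima = local_minima(f)
convex = [(i, j) for i in minima for j in minima if is_convex_pair(f, i, j)]
print(convex, pit(f, 7, 16), top(f, 7, 10))
```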







A. Kovacs 




i    =    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
x(i) =    0   1   2   5  10  50  99   4  99  50  12  50  10   5   2   1   0

Fig. 1. A square-optimal solution for the given pSMC problem on a path of length
15. The convex pairs are (0,16), (0,7), (7,16), (7,10), and (10,16). Other examples:
$pit(7,16) = 10$; $top(7,10) = 8$; $\ell(10) = 7$, and $r(10) = 13$.



Note that $pit(i,j)$ is the (leftmost) local minimum of smallest finishing time
between i and j, and both $(i, pit(i,j))$ and $(pit(i,j), j)$ are convex pairs.

The algorithm tests for every pair $(i,j)$ whether it can be a convex pair in
a square-optimal schedule. It proceeds from short distances to long ones - i.e.,
first it tests each pair of the form $(i,i+2)$, and at last $(0,n+1)$. It proceeds
dynamically by testing for each k between i and j whether it can be $pit(i,j)$ or $top(i,j)$.
We will add up the computed optimum on $[i+1,k-1]$, and on $[k+1,j-1]$,
and $f(k)$, to obtain a possible sum of finishing times on $[i+1,j-1]$.

Of course, an optimum that we calculated this way is only realizable if we
can indeed 'glue' two optimal schedules by the schedule of k. Thus, we will also
have to fix and test, one by one, some characteristics of the schedule on i, j and k.

For the time being, let us suppose that if $k = pit(i,j)$ then $f(k) < f(i+1)$ and
$f(k) < f(j-1)$. That is, we disregard the fact that there might be several stairs
on the two sides of $[i,j]$ having finishing time smaller than $f(k)$, or stair-downs
finishing under $f(i)$ if, e.g., $f(i) > f(j)$ (see $(i,k,j) = (7,10,16)$ on Fig. 1).

We will fix the number of black-black, white-black, black-white, and
white-white levels concerning the $(i,j)$ pair. Since we just fix the number, and not the
location of these levels, even testing for all $O(p^4)$ possibilities would result in a
pseudo-polynomial algorithm.

Definition 7. For a convex pair $(i,j)$ in a fixed schedule $\Psi$, let $C(i,j) \in [0,2p]^4$
be the 4-tuple denoting the number of levels under $f(i,j)$ where $(i,j)$ is
black-black, white-black, black-white, and white-white, respectively. We will call $C(i,j)$
the scheme of $(i,j)$. For the triple $i < k = pit(i,j) < j$, we will talk about the
scheme $C(i,k,j) \in [0,2p]^8$ under the level $f(k)$, in the same sense.

We say that $C(i,k,j)$ is consistent with $C(i,j)$ when the two schemes agree with each other,
i.e., the number of black-white-black and the number of black-black-black levels of
$(i,k,j)$ sum up to the number of black-black levels of $(i,j)$, and so on. Similarly,
we speak of $C(i,k,j)$ being consistent with $C(i,k)$ and with $C(k,j)$.



Remark 1. Note that the four numbers in $C(i,j)$ sum up to $f(i,j)$ and the
eight numbers in $C(i,k,j)$ sum up to $f(k)$. Note also that consistency of $C(i,k,j)$ with $C(i,j)$
implies that the number of white-white-white plus the number of
white-black-white levels in $C(i,k,j)$ equals the number of white-white levels in $C(i,j)$ plus
$f(k) - f(i,j)$. Here is where we implicitly exploit that $(i,j)$ is supposed to be
a convex pair, and therefore $f(k) \ge f(i,j)$: for a fixed $C(i,j)$ we will want
to test all possible $C(i,k,j)$ that are consistent with $C(i,j)$. Suppose
we wanted to calculate the optimum by taking also $f(k) < f(i,j)$ values into
consideration. For this calculation we would need in advance the scheme of $(i,j)$
under each possible $f(k)$ level. On the other hand, knowing $C(i,j)$ under all of
the levels boils down to having $\Psi(i)$ and $\Psi(j)$, and that is exactly what we tried
to avoid, as testing all that would lead beyond polynomial time.
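
The counting behind Definition 7 and Remark 1 can be illustrated as follows. A scheme is stored as a counter over colour patterns such as "bw" or "wbw" (b for black, w for white), and consistency is checked by summing out the middle node, with the correction $f(k) - f(i,j)$ on the white-white entry. The representation, the function names, and the five-node example are our assumptions; the example only needs $f(k) \ge f(i,j)$ and is not claimed to be optimal.

```python
from collections import Counter


def colours(schedule, nodes, level):
    """Colour pattern of `nodes` on `level`, e.g. 'bwb'."""
    return "".join("b" if level in schedule[v] else "w" for v in nodes)


def scheme(schedule, nodes, height):
    """Number of levels 1..height showing each colour pattern on `nodes`."""
    return Counter(colours(schedule, nodes, a) for a in range(1, height + 1))


def is_consistent(triple, pair, f_k, f_ij):
    """Check that a scheme of (i, k, j) under f(k) agrees with one of (i, j) under f(i, j)."""
    for left in "bw":
        for right in "bw":
            total = triple[left + "b" + right] + triple[left + "w" + right]
            expected = pair[left + right]
            if left + right == "ww":
                # Levels between f(i, j) and f(k) are white for both i and j.
                expected += f_k - f_ij
            if total != expected:
                return False
    return True


# Hypothetical proper schedule on the path 1 - 2 - 3 - 4 - 5.
schedule = {1: {1}, 2: {2, 3}, 3: {1, 4}, 4: {2, 3}, 5: {1}}
i, k, j = 1, 3, 5
f_ij = max(max(schedule[i]), max(schedule[j]))
f_k = max(schedule[k])
pair = scheme(schedule, (i, j), f_ij)
triple = scheme(schedule, (i, k, j), f_k)
print(dict(pair), dict(triple), is_consistent(triple, pair, f_k, f_ij))
```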

For each pair $i < j$ and each possible scheme $C(i,j)$ of the pair, the algorithm
computes an optimum sum of finishing times $F(i,j,C(i,j)) = \sum_{s=i+1}^{j-1} f(s)$,
supposing that $(i,j)$ is a convex pair. In particular, we test for each $k \in [i+1,j-1]$
to be $pit(i,j)$ and for each scheme $C(i,k,j)$ consistent with $C(i,j)$.

For $C(i,k)$ and $C(k,j)$ consistent with $C(i,k,j)$ we have a previously
computed $F(i,k,C(i,k))$ and a $F(k,j,C(k,j))$ value. We obtain the sum of finishing
times for this k and $C(i,k,j)$ by $F(i,k,C(i,k)) + F(k,j,C(k,j)) + f(k)$. We also
test if k can be $top(i,j)$. In that case the sum of finishing times is computed
easily based on Corollary 1 and on $C(i,j)$. Finally we choose the smallest sum
of finishing times to be $F(i,j,C(i,j))$ and remember one $k = pit(i,j)$ and one
$C(i,k,j)$, or a $k = top(i,j)$ that yielded this value. At the end of this process
we will have all the local minima and maxima in an optimal schedule and the
schemes of convex pairs of minima. This is sufficient information to create such
a schedule, starting with a minimum of smallest finishing time.

The algorithm is still a bit more complicated than this, because we may have
stairs on the two sides of $[i,j]$ that we did not yet consider. Before we elaborate
on this we shall need one more definition:

Definition 8. Let i be a local minimum in a schedule $\Psi$. Let $i < r(i) \le n+1$ be
the node with the property $f(r(i)) < f(i)$ and $\forall k$, $i < k < r(i)$: $f(k) \ge f(i)$. We
define $0 \le \ell(i) < i$ symmetrically.
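
In other words, $r(i)$ and $\ell(i)$ are the nearest nodes to the right and to the left of i that finish strictly earlier than i. A minimal sketch, with function names of our choosing and with hypothetical finishing times selected to match the values quoted in the caption of Fig. 1 and in the discussion of Theorem 1:

```python
def r_of(f, i):
    """Nearest node to the right of i with a strictly smaller finishing time."""
    for k in range(i + 1, len(f)):
        if f[k] < f[i]:
            return k
    return None  # only possible if f[i] == 0, i.e. for a sentinel


def l_of(f, i):
    """Nearest node to the left of i with a strictly smaller finishing time."""
    for k in range(i - 1, -1, -1):
        if f[k] < f[i]:
            return k
    return None  # only possible if f[i] == 0, i.e. for a sentinel


# Hypothetical finishing times for the instance of Fig. 1 (an assumption).
f = [0, 1, 3, 7, 15, 60, 149, 5, 104, 62, 12, 62, 15, 7, 3, 1, 0]
print(l_of(f, 10), r_of(f, 10), l_of(f, 7), r_of(f, 7))
```

Because the sentinels 0 and n+1 finish at time 0, both values exist for every local minimum with positive finishing time.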

Now, with these terms in hand, suppose we are testing the pair $(i,j)$ and
e.g. $f(i) > f(j)$ holds. Then we will need $C(i,r(i))$ instead of $C(i,j)$ and if
$f(k) > f(i)$, then we will need $C(\ell(k),k,r(k))$ instead of $C(i,k,j)$ (see Fig. 1).
This won't make much difference, because $[i+1,\ell(k)]$ consists of stair-ups only,
just as $[r(k),j-1]$ consists of stair-downs only, and there the schedule and
finishing times have a simple one-to-one correspondence to those of i and j. Let's
say that all of $i \le \ell(k) < k < r(k) \le r(i) \le j$ are fixed. The schemes $C(i,r(i))$
and $C(\ell(k),k,r(k))$, and the consistency of $C(\ell(k),k,r(k))$ with $C(i,r(i))$, can be
defined like before. In the latter, now we have to correct all the four values in
$C(i,r(i))$ with the appropriate values that sum up to $f(k) - f(i)$ (not only the
white-white value, see Remark 1).

The values defined so far ($f$, $pit$, $top$, etc.) may carry an appended subscript $\Psi$
in order to stress that they are relative to a schedule $\Psi$, or a subscript A when a
value was yielded by the algorithm and is not necessarily realized by a schedule.

4 An $O(n^4 p)$ Algorithm

In Section 3 the main idea of the algorithm was presented. Turning to the exact
description we will make use of two technical theorems - Theorems 1 and 2 - that
will help us simplify the algorithm and improve its performance. The detailed
proofs omitted from this section can be found in [3].

Here we present an $O(n^4 p)$ time algorithm. In the conclusions we will give a
short argument on how to reduce its time bound to $O(n^3 p)$.

Theorems 1 and 2 are based on the following lemmas: 

Lemma 1. Let i be a local minimum in a square-optimal schedule, such that it
is not compact, and let $\ell(i) \le k \le r(i)$. Then the following hold:

The $(i,k)$ pair is not conflicting on the level $f(i)$;

The $(i,k)$ pair is not conflicting on any level where i is white.

Proof. Suppose for example, that $i < k \le r(i)$, and $(i,k)$ is conflicting on the
level $f(i)$. There must be an $i < s < k$ such that $(s,s+1)$ is white-white on level
$f(i)$. Let M be the first local maximum to the left of i. Now all the finishing
times on $[M,s]$ are not smaller than $f(i)$. Let i be white on the level $a < f(i)$.
Let's change the levels a and $f(i)$ on $[M,s]$. Since $(s,s+1)$ was white-white,
this remains a proper schedule on $[s,s+1]$ and it reduces $f(i)$ by at least 1. If
it is not a proper schedule on $[M-1,M]$, we can correct it by increasing $f(M)$
by 1 (like in the proof of Proposition 3). This remains an optimal schedule, but
improves the former schedule concerning square-optimality, a contradiction.

If $(i,k)$ is conflicting on the level a, the argument is essentially the same. □

Corollary 2. If $f(i) > x(i)$ then $i \not\equiv r(i) \pmod 2$, and $i \not\equiv \ell(i) \pmod 2$,
otherwise $(\ell(i),i)$ or $(i,r(i))$ would be conflicting on the level $f(i)$.

Lemma 2. Let $(i,j)$ be a convex pair in a square-optimal schedule $\Psi$. If $f(i) =
f(j)$ then i and j are compact and $f(i) = x(i) = x(j) = f(j)$.

In the proof we first point out that $\Psi(i)$ and $\Psi(j)$ must be the same. After
that, a similar level-changing argument shows that they must be compact.

Lemma 3. If i is a non-compact local minimum in a square-optimal schedule,
then for any level $a < f(i)$ there is at most one $i < k < r(i)$ and at most one
$\ell(i) < k < i$ such that $(k,k+1)$ is white-white on the level a.

Proof. Suppose we have $i < k < k' < r(i)$ such that $(k,k+1)$ and $(k',k'+1)$ are
both white-white on level a. Note that by Lemma 1 i must be black on level a. Let
i be white on level b. If $i \equiv k \pmod 2$ then k is white on level b (see Lemma 1).
Then we change the levels a and b on $[k+1,k']$; if $i \not\equiv k \pmod 2$ then k is white
on the level $f(i)$. Then we change the levels a and $f(i)$ on $[k+1,k']$. In both
cases we obtain a proper square-optimal schedule that contradicts Lemma 1,
because $(i,k+1)$ is conflicting on level $f(i)$ or on level b. □

Now suppose $(i,j)$ is a convex pair, $f(i) > f(j)$, and $k = pit(i,j)$. Theorem 1
implies that in most cases $C(i,r(i))$ determines $\ell(k)$, $r(k)$, and $C(\ell(k),k,r(k))$ if
i, $r(i)$, and k are fixed. In turn, Theorem 2 shows that i, j, and $f(i)$ are sufficient
to determine $r(i)$ and $C(i,r(i))$. In conclusion, the factor depending on p in the running
time of the rough algorithm can be reduced to p.

The proof of Theorem 1 is basically the same as that of Lemma 3:

Theorem 1. Let $(i,j)$ be a convex pair, $k = pit(i,j)$, and k be non-compact in
a square-optimal schedule.

(1) If $f(i,j) > \max(x(i),x(j))$ then $(\ell(k),k,r(k))$ can only be black-white-black,
black-black-white, white-black-black or white-black-white.

(2) If $f(i,j) = \max(x(i),x(j))$ then in addition to the previous four, a
black-black-black triple is also possible.

In Fig. 1, for example, (1) holds if $(i,k,j) = (7,10,16)$, and (2) holds if
$(i,k,j) = (0,7,16)$. Compare the levels 2 and 3 of $(\ell(k),k,r(k)) = (2,7,14)$.

Definition 9. Let $(i,j)$ be a convex pair. According to $f(i) > f(j)$, $f(i) < f(j)$,
or $f(i) = f(j)$ we will use the name relevant scheme of $(i,j)$ for the scheme
$C(i,r(i))$, $C(\ell(j),j)$, or $C(i,j)$, respectively.



Theorem 2. Let $(i,j)$ be a convex pair in a square-optimal schedule. If either
$f(i,j) \ne \max(x(i),x(j))$, or ($f(i,j) = \max(x(i),x(j))$ and $k = top(i,j)$), is
known, then the relevant scheme of $(i,j)$ can be computed in $O(n)$ time.

Sketch of proof. Note that, interestingly, knowing whether $f(i)$ or $f(j)$ is larger is not
a condition in the theorem. Namely, it is either obvious or irrelevant: If, e.g.,
$f(i,j) > x(j-1) + x(j)$ then $f(i,j) = f(i) > f(j)$ and we have to compute $r(i)$
and then $C(i,r(i))$. On the other hand, if $\max(x(i),x(j)) < f(i,j) < \min(x(i)+x(i+1), x(j)+x(j-1))$
then exactly one of $f(i) = f(j)$, $\ell(j) = i$ or $r(i) = j$ holds,
and we have to compute $C(i,j)$. In any case, we need a $C(s,t)$ value, where $(s,t)$
is not white-white on any level, because either one of them is compact, or one is
a non-compact local minimum and they are of different parity. We get that the
number of white-black levels is $f(s,t) - x(s)$, the number of black-white levels
is $f(s,t) - x(t)$, and the number of black-black levels is $x(s) + x(t) - f(s,t)$.
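
The three counts at the end of this argument follow from simple bookkeeping: each of the $f(s,t)$ levels is black-black, white-black or black-white, node s is black on $x(s)$ of them and node t on $x(t)$ of them. A minimal sketch of this computation, with a hypothetical function name and a small cross-check against explicit sets:

```python
def scheme_without_white_white(x_s, x_t, f_st):
    """Scheme of a pair (s, t) that is white-white on no level under f(s, t)."""
    counts = {
        "bb": x_s + x_t - f_st,  # both black
        "wb": f_st - x_s,        # s white, t black
        "bw": f_st - x_t,        # s black, t white
        "ww": 0,
    }
    if min(counts.values()) < 0:
        raise ValueError("demands and finishing time are not compatible")
    return counts


# Hypothetical pair: s is black on {1, 2, 4, 5}, t is black on {2, 3}.
psi_s, psi_t = {1, 2, 4, 5}, {2, 3}
f_st = max(max(psi_s), max(psi_t))
counts = scheme_without_white_white(len(psi_s), len(psi_t), f_st)
print(counts)
```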

Now we show how to obtain $r(i)$ if, e.g., $f(i,j) = f(i) > f(j)$. First let $f(i) >
x(i)$. Note that $r(i), \ldots, (j-1)$ is a (possibly empty) series of stair-downs. Let s be
a node for which $x(s-1) + x(s) > f(i) > x(s) + x(s+1) > \ldots > x(j-1) + x(j)$
holds. If such an s exists, we can find it in $O(n)$ steps. We claim that $r(i) = s$. If
$r(i) - 1$ is also a stair-down, then the claim is trivial; if it is a local maximum, then
a short argument is needed to prove it. Second, if $f(i) = x(i)$ and $k = top(i,j)$
then $r(i) = \max(k+1, s)$, or $r(i) = k+1$ if s does not exist. □

The following two simple lemmas describe two basic steps of the algorithm:

Lemma 4. Let $(i,j)$ be a convex pair of minima in a square-optimal solution $\Psi$,
and let $k = top_\Psi(i,j)$. We can compute $f_\Psi(k)$ in $O(n)$ steps if $f_\Psi(i,j)$ is known.

Sketch of proof. If $k = top(i,j)$, then the schedule on the stairs $[i+1,k-1]$ and
$[k+1,j-1]$ is determined by $\Psi(i)$ and $\Psi(j)$ in a greedy manner. Finally, the
local maximum k is black on a level if and only if $(k-1,k+1)$ is white-white.

E.g., let $f(i) > f(j)$. We proceed along the levels from the bottom up, and
sum up the number of levels where k is black. Under level $f(i,j)$ we obtain this
number from $C(i,r(i))$, where $C(i,r(i))$ can be computed in $O(n)$ time. Since
above the level $f(i,j)$ we know the schedule on $[i+1,j-1]$, here the summing up
is straightforward by way of merging two increasing number series: the finishing
times of stairs on the two sides. Merging and summing takes $O(n)$ time. □

Lemma 5. Let $(i,j)$ be a convex pair of minima in a square-optimal solution $\Psi$,
and let $k = pit_\Psi(i,j)$. If $f_\Psi(i,j)$ is known and $f_\Psi(i,j) > \max(x(i),x(j))$, then
it is possible to compute $f_\Psi(k)$ in $O(n)$ steps.

The proof of Lemma 5 is much the same as that of Lemma 4: we proceed from
the bottom and sum up the levels where k is black (see Theorem 1), until the
sum reaches $x(k)$. We also obtain $\ell(k)$ and $r(k)$. Under $f(i,j)$ we use $C(i,r(i))$,
above that we know the schedule on $[i+1,\ell(k)]$ and on $[r(k),j-1]$.

We will run the algorithms of Lemma 4 and 5 on all possible $(i,j,f(i,j),k)$.
The results, denoted by $fmax_A(i,j,f,k)$ and $fmin_A(i,j,f,k)$, respectively,
provide the finishing time of k, if $(i,j)$ is a convex pair and $k = top(i,j)$ resp. $pit(i,j)$
in a square-optimal schedule. If the setting $(i,j,f,k)$ is not feasible, we may get
no numeric result, which we denote by $fmax/fmin(i,j,f,k) = \infty$.

We continue with an exact description of the algorithm. It has two phases:
Phase 1. In this phase we compute an optimal structure: locations and
finishing times of local minima, and locations of local maxima. We will not do the
scheduling here, but we will get the optimum sum of finishing times as a result.

For all $0 \le i < j \le n+1$ where $i+2 \le j$ and all f where $\max(x(i),x(j)) \le
f \le 3\max(x(i),x(j))$, we will compute $F_A(i,j,f)$ with the following meaning: if
in a square-optimal solution $\Psi$, $(i,j)$ is a convex pair of minima, and $f_\Psi(i,j) = f$,
then $\sum_{s=i+1}^{j-1} f_\Psi(s) = F_A(i,j,f)$. We proceed from short $[i,j]$ subpaths to longer
ones, using a dynamic programming strategy. First we compute

$$F_{max} = \min_{i<k<j} \Big( \sum_{s=i+1}^{k-1} (x(s-1) + x(s)) + fmax_A(i,j,f,k) + \sum_{s=k+1}^{j-1} (x(s+1) + x(s)) \Big)$$

in $O(n^2)$ time (see Lemma 4). Second, if $f > \max(x(i),x(j))$ then we compute

$$F_{min} = \min_{k \in [i+2,j-2]} \big( F_A(i,k,fmin_A(i,j,f,k)) + fmin_A(i,j,f,k) + F_A(k,j,fmin_A(i,j,f,k)) \big)$$

in $O(n^2)$ time (see Lemma 5). If $f = \max(x(i),x(j))$, then we have to test for
all the possible values of the finishing time of $k = pit(i,j)$. Thus, we compute

$$F_{min} = \min_{k \in [i+2,j-2],\ f' \in [x(k),3x(k)]} \big( F_A(i,k,f') + f' + F_A(k,j,f') \big)$$

in $O(np)$ time. If k and $f'$ provided the minimum sum, then we assign the $f'$
value to $fmin_A(i,j,f,k)$.

We obtain $F_A(i,j,f) = \min(F_{max}, F_{min})$.

Again, $F_A(i,j,f) = \infty$ means that $(i,j)$ has lost the chance to be a convex
pair of minima with $f(i,j) = f$ in any square-optimal schedule. If $F_A(i,j,f) \ne \infty$
then together with the value $F_A(i,j,f)$ we also store which $k \in [i+1,j-1]$
resulted in the minimum. If it was a unique local maximum (that is if $F_{max} <
F_{min}$) then we record it by $top_A(i,j,f) = k$. Otherwise we record $pit_A(i,j,f) =
k$. The computation for one $(i,j)$ pair takes $O(n^2 p)$ time. Overall computation
of Phase 1 takes $O(n^4 p)$ time.

for d = 2 ... n+1 do
  for 0 ≤ i < j ≤ n+1, j − i = d do
    for f = max(x(i), x(j)) ... 3 max(x(i), x(j)) do
      for k = i+1 ... j−1 do
        compute fmax(i,j,f,k) in time O(n) (Lemma 4)
      end for
      Fmax := min_{i<k<j} ( Σ_{s=i+1}^{k−1} (x(s−1) + x(s)) + fmax(i,j,f,k) + Σ_{s=k+1}^{j−1} (x(s+1) + x(s)) )
      top(i,j,f) := the k providing the minimum value
      for k = i+2 ... j−2 do
        if f = max(x(i), x(j)) then
          compute min_{f'} ( F(i,k,f') + f' + F(k,j,f') )
          fmin(i,j,f,k) := the f' providing the minimum
        else
          compute fmin(i,j,f,k) in time O(n) (Lemma 5)
        end if
      end for
      Fmin := min_{i<k<j} ( F(i,k,fmin(i,j,f,k)) + fmin(i,j,f,k) + F(k,j,fmin(i,j,f,k)) )
      pit(i,j,f) := the k providing the minimum value
      if Fmax < Fmin then
        F(i,j,f) := Fmax and pit(i,j,f) := 0
      else
        F(i,j,f) := Fmin and top(i,j,f) := 0
      end if
    end for
  end for
end for

Fig. 2. Phase 1 of the $O(n^4 p)$ algorithm.

The proof of the following theorem is straightforward by induction on the
lengths of subpaths:

Theorem 3. If $\Phi$ is a valid schedule of the given demands on the path $[1,n]$,
then $\sum_{s=1}^{n} f_\Phi(s) \ge F_A(0,n+1,0)$.

procedure main()
  schedule(0, n+1, 0)

procedure schedule(i, j, f)
  if top(i,j,f) ≠ 0 then
    k := top(i,j,f)
    greedy(i, k−1)
    greedy(j, k+1)
    greedy(k)
  else
    k := pit(i,j,f)
    fk := fmin(i,j,f,k)
    minschedule(i, j, k, fk)
    schedule(i, k, fk)
    schedule(k, j, fk)
  end if

procedure greedy(k)
  make k black on the smallest x(k) levels where both k−1 and k+1 are white

procedure minschedule(i, j, k, fk)
  ℓ(k) := min{s > i | x(s+1) + x(s) > fk}
  r(k) := max{s < j | x(s−1) + x(s) > fk}
  greedy(i, ℓ(k))
  greedy(j, r(k))
  first make k black below fk, where it doesn't conflict with both of Ψ(ℓ(k)) and Ψ(r(k))
  if there are < x(k) black levels then
    fill up from the bottom the missing black levels
  end if

procedure greedy(i, j)
  if i < j then
    for k = i+1 ... j do
      if k is not scheduled then
        make k black on the smallest x(k) levels where k−1 is white
      end if
    end for
  end if
  if j < i then
    proceed symmetrically
  end if

Fig. 3. Phase 2.



Phase 2. In this phase we give an optimal schedule $\Phi$. Now we proceed from
long subpaths - starting with $[0,n+1]$ - to shorter ones. Phase 1 provided the
local minima in the form of $pit_A()$ values and their finishing times as $fmin_A()$,
and the local maxima in the form of $top_A()$ values. While computing the schedule
of these nodes we will basically follow the same algorithm as in Lemma 4 and 5
when we calculated their finishing times. Along the way, it is straightforward to
show - we won't do this here - that $\Phi$ is a valid schedule and optimal, because
$\sum_{s=1}^{n} f_\Phi(s) = F_A(0,n+1,0)$ (see Theorem 3).

First we take $pit_A(0,n+1,0)$ or $top_A(0,n+1,0)$. One of them must exist,
because there exist valid schedules on the path.

If we have a $k = top_A(0,n+1,0)$ then there is just one local maximum in
our optimal schedule. Then we do the scheduling in the greedy way on the
stair-ups of $[1,k-1]$ and on the stair-downs of $[k+1,n]$. Finally we do the greedy
scheduling on k, that is, we make k black where $(k-1,k+1)$ is white-white.
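
The greedy steps of Fig. 3 can be sketched as follows, assuming schedules are stored as a dictionary from nodes to sets of levels. The function names are ours, and the run below uses only the first five demands of Fig. 1; it illustrates that greedy scheduling of stair-ups from a compact start reproduces the finishing times $x(k-1) + x(k)$ of Corollary 1.

```python
def smallest_free_levels(count, blocked):
    """The `count` smallest positive levels that are not in `blocked`."""
    levels, a = set(), 1
    while len(levels) < count:
        if a not in blocked:
            levels.add(a)
        a += 1
    return levels


def greedy_stairs(x, schedule, i, j):
    """greedy(i, j) of Fig. 3: fill i+1..j left to right, or i-1..j right to left if j < i."""
    step = 1 if i < j else -1
    for k in range(i + step, j + step, step):
        if k not in schedule:
            # k avoids the levels of the neighbour that was scheduled just before it
            schedule[k] = smallest_free_levels(x[k], schedule[k - step])
    return schedule


def greedy_top(x, schedule, k):
    """greedy(k) of Fig. 3: k takes the smallest levels where both neighbours are white."""
    schedule[k] = smallest_free_levels(x[k], schedule[k - 1] | schedule[k + 1])
    return schedule


# Stair-ups: demands of nodes 0..5 in Fig. 1, started from the sentinel node 0.
x = [0, 1, 2, 5, 10, 50]
stairs = greedy_stairs(x, {0: set()}, 0, 5)
print([max(stairs[k]) for k in range(1, 6)])

# A hypothetical path with a single local maximum at node 2.
y = [0, 1, 3, 1, 0]
hill = {0: set(), 4: set()}
greedy_stairs(y, hill, 0, 1)
greedy_stairs(y, hill, 4, 3)
greedy_top(y, hill, 2)
print(hill)
```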

If on the other hand we have $k = pit_A(0,n+1,0)$, then k will be the (leftmost)
smallest local minimum in $\Phi$. We have the finishing time $f_\Phi(k) := fmin_A(0,n+1,0,k)$
as well. Let $f_\Phi(k) \ne x(k)$ (otherwise $\Phi(k)$ is trivial). Now we obtain
$\ell(k)$ and $r(k)$ again, while we are doing the greedy scheduling on the stairs
$[1,\ell(k)]$ and $[r(k),n]$. In the meantime we schedule k: since k is not compact,
$\ell(k) \not\equiv k \not\equiv r(k) \pmod 2$, and we make k black where $(\ell(k),r(k))$ is not
black-black. If $x(k)$ is not used up till we reach $fmin_A(0,n+1,0,k)$, then we are free
to assign the 'extra' black levels from bottom up (see node 7 on Fig. 1).

Now we proceed recursively: first we make the schedule on $[0, k]$, then on $[k, n+1]$.
Exactly one of $\mathit{top}_A(0, k, \mathit{fmin}_A(k))$ or $\mathit{pit}_A(0, k, \mathit{fmin}_A(k))$ exists; in the
first case we make the schedule on the whole subpath $[0, k]$, in the latter case on
$k' = \mathit{pit}(0, k)$, on $[\ell(k)+1, \ell(k')]$, and on $[r(k'), k-1]$. And so on...

Computing the $\mathit{pit}_A$, $\mathit{fmin}_A$ etc. values in Phase 1 corresponds to creating
the schedule in Phase 2. This guarantees that we make ends meet: e.g., the
$\mathit{fmin}_A$ values are not higher than the stairs on both sides, and all these details.

Analysis. We call $a$ a preemption level if there is a node $i$ s.t. $i$ is black on $a$
and white on $a+1$, or vice versa. In our schedule, any preemption level is either
the finishing time of a node, or the last of extra black levels in a local minimum
(e.g. level 2 of node 7 on Fig. 1). Hence, there are $O(n)$ preemption levels. We
could record the preemption levels of any node by way of merging or revisiting
preemption levels of at most two other nodes. So, Phase 2 requires $O(n^2)$ time.

5 Conclusions 

The $O(n^3 p)$ Algorithm on Paths and Cycles. Let's regard a fixed triple $(i, j, f)$
in Phase 1. We claim that we can compute $\mathit{fmax}_A(i, j, f, k)$ and $\mathit{fmin}_A(i, j, f, k)$
overall for all $k \in [i+1, j-1]$ in $O(n)$ time. This reduces the time to $O(n^3 p)$
(compare Figure 2). First, in Theorem 2, the part of the computation taking $O(n)$
time doesn't depend on $k$. Second, in Lemmas 4 and 5, the merging and summing
procedure is exactly the same for $k$'s of the same parity. If the $x(k)$ demands of
e.g. all even $k$'s are sorted in non-decreasing order, then during the merging of the
two sides and summing the black levels we can in parallel record the $\mathit{fmax}$ resp.
$\mathit{fmin}$ finishing times of all the $k$'s (sorted by demands). Let's sort the odd and
even nodes by their demands at the beginning of Phase 1.

Finally, let's regard the case of cycles: First we compute $F_A(i, j, f)$ for
all $[i, j]$ subpaths of the cycle and all possible $f$ values. Since on a cycle
a node of minimum finishing time is compact, the optimum sum will be
$\min_{1 \le i \le n} (F_A(i, i, x(i)) + x(i))$, where $F_A(i, i, x(i))$ is the computed optimum
on $i+1, i+2, \ldots, i+n-1 \equiv i-1 \pmod{n}$. Finally, we start the scheduling with
the compact local minimum.

Future work. It remains challenging to find an algorithm for this problem that is
polynomial in $n$ (and $\log p$). We firmly believe that this is possible.

There is no obvious way to exploit our idea on conflict-graphs of maximum 
node degree > 3. However, it may be interesting to examine graph-classes with 
just a small number of nodes of degree > 3. 

Acknowledgements. I would like to thank Daniel Marx for the idea of
considering square-optimal schedules. Special thanks to Katalin Friedl for directing
me to the problem, and for her advice and a lot of help during the writing of this
paper.

Abstract. We present an improved average case analysis of the
maximum cardinality matching problem. We show that in a bipartite or
general random graph on $n$ vertices, with high probability every
non-maximum matching has an augmenting path of length $O(\log n)$. This
implies that augmenting path algorithms like the Hopcroft-Karp
algorithm for bipartite graphs and the Micali-Vazirani algorithm for general
graphs, which have a worst case running time of $O(m\sqrt{n})$, run in time
$O(m \log n)$ with high probability, where $m$ is the number of edges in the
graph. Motwani proved these results for random graphs when the
average degree is at least $\ln(n)$ [Average Case Analysis of Algorithms for
Matchings and Related Problems, Journal of the ACM, 41(6), 1994]. Our
results hold if only the average degree is a large enough constant. At the
same time we simplify the analysis of Motwani.



1 Introduction 

We consider the problem of computing a matching of maximum cardinality in
an undirected graph $G = (V, E)$ with vertex set $V$ and edge set $E$. A matching
is a subset $M \subseteq E$ of the edges of $G$ such that no two edges in $M$ have a vertex
in common. The edges in $M$ are called matching edges, edges not in $M$ are called
free edges. A vertex is matched if it has an incident matching edge, otherwise it
is free.

Augmenting Path Algorithms. Most matching algorithms are augmenting path
algorithms. An augmenting path for a non-maximum matching $M$ is a simple
path between two free vertices, where the edges along the path are alternately
free edges and matching edges. For every non-maximum matching, an
augmenting path exists (e.g., obtained by taking the symmetric difference of the set of
matching edges with the edge set of an arbitrary optimal matching). By making
each free edge a matching edge and vice versa along such a path, a matching
that is larger by one edge is obtained. Augmenting path algorithms search for
augmenting paths and augment, until the matching is maximum. The algorithms
differ in the way they search for augmenting paths.
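
As a small illustration of the augmentation step described above, the following Python sketch flips the edges along an augmenting path by taking a symmetric difference. The edge representation (a frozenset of two endpoints) and the function name `augment` are assumptions made for this example only; they do not come from the paper.

```python
def augment(matching, path):
    """Flip free and matching edges along `path`.

    matching: set of frozenset({u, v}) edges, pairwise vertex-disjoint.
    path: list of vertices of an augmenting path (both endpoints free).
    Returns a new matching with exactly one more edge.
    """
    path_edges = {frozenset(edge) for edge in zip(path, path[1:])}
    # Symmetric difference: free edges on the path enter the matching,
    # matching edges on the path leave it.
    return matching ^ path_edges


# Path a - b = c - d, where "=" marks the only matching edge.
matching = {frozenset(("b", "c"))}
larger = augment(matching, ["a", "b", "c", "d"])
print(sorted(sorted(edge) for edge in larger))  # prints [['a', 'b'], ['c', 'd']]
```

The path has one more free edge than matching edges, which is why the result is larger by exactly one edge.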

Complexity. Maximum matchings can be computed efficiently. Let $n$ and $m$
denote the number of vertices and edges of $G$, respectively. In bipartite graphs,
the algorithm of Hopcroft and Karp [10] computes a maximum matching in
time $O(m\sqrt{n})$. For dense graphs, i.e., with $m = \Theta(n^2)$, slightly better
algorithms are known. Cheriyan and Mehlhorn [3] obtained $O(n^{2.5}/\log n)$ and Feder
and Motwani [7] achieved, via graph compression, $O(m\sqrt{n}/\varphi(n, m))$, where
$\varphi(n, m) = \log n / \log(n^2/m)$. In general graphs, Edmonds' blossom-shrinking
algorithm [5,4,8] computes a maximum matching in time $O(n m \alpha(m, n))$, where
$\alpha(m, n)$ denotes the inverse of Ackermann's function. Micali and Vazirani [13]
gave an $O(m\sqrt{n})$ algorithm, which is similar to the algorithm of Hopcroft and
Karp for bipartite graphs.

The algorithms of Hopcroft and Karp [10] and Micali and Vazirani [13] are of
particular interest in this paper. The algorithms run in phases. In each phase we
first construct a maximal set of vertex-disjoint shortest augmenting paths, and
then augment the current matching along these paths. A phase requires time
$O(m)$. In both algorithms the length of the shortest augmenting path strictly
increases from one phase to the next and thus a bound on the maximal length
of shortest augmenting paths implies a bound on running time: If every
non-maximum matching in a bipartite (general) graph has an augmenting path of
length at most $f(n)$, then the Hopcroft-Karp (Micali-Vazirani) algorithm runs
in time $O(m \cdot f(n))$.
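
The phase structure described above can be sketched in Python as follows. This is a minimal, recursion-based illustration of the Hopcroft-Karp scheme for bipartite graphs, not the implementation analysed in the paper; the function name `hopcroft_karp`, the adjacency-list input format and the returned phase counter are assumptions made for the example.

```python
from collections import deque

INF = float("inf")


def hopcroft_karp(adj, n_right):
    """Maximum matching in a bipartite graph, counting the phases.

    adj[u] lists the right-side neighbours of left vertex u.
    Returns (match_left, phases); match_left[u] is the partner of u or -1.
    """
    n_left = len(adj)
    match_left = [-1] * n_left
    match_right = [-1] * n_right
    dist = [INF] * n_left

    def bfs():
        """Layer the left vertices; return the layer where shortest paths end."""
        queue = deque()
        for u in range(n_left):
            dist[u] = 0 if match_left[u] == -1 else INF
            if dist[u] == 0:
                queue.append(u)
        shortest = None
        while queue:
            u = queue.popleft()
            if shortest is not None and dist[u] >= shortest:
                continue  # deeper layers cannot lie on a shortest path
            for v in adj[u]:
                w = match_right[v]
                if w == -1:
                    shortest = dist[u]
                elif dist[w] == INF:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return shortest

    def dfs(u, shortest):
        """Look for a shortest augmenting path from u and augment along it."""
        for v in adj[u]:
            w = match_right[v]
            if w == -1:
                ok = dist[u] == shortest
            else:
                ok = dist[w] == dist[u] + 1 and dfs(w, shortest)
            if ok:
                match_left[u], match_right[v] = v, u
                return True
        dist[u] = INF  # dead end for the rest of this phase
        return False

    phases = 0
    while True:
        shortest = bfs()
        if shortest is None:  # no augmenting path left: matching is maximum
            return match_left, phases
        phases += 1
        for u in range(n_left):
            if match_left[u] == -1:
                dfs(u, shortest)


adj = [[0, 1], [0], [1, 2]]
match_left, phases = hopcroft_karp(adj, n_right=3)
```

Each pass of the outer loop is one phase costing $O(m)$ time, so the returned `phases` value is the quantity that a bound $f(n)$ on the length of shortest augmenting paths controls.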

In practice, augmenting path algorithms perform significantly better than 
suggested by the worst case running times, see, e.g., [11,2]. The worst case run- 
ning time seems to be an over-pessimistic estimation of the actual running time 
in practice. We are therefore interested in the average case behavior of augment- 
ing path algorithms. 

Random Graph Models. We define the probability distribution on graphs
according to the model introduced by Erdős and Rényi [6]. We consider both
bipartite and general graphs. We denote by $G(n; n)$ the set of all undirected
bipartite graphs with $n$ vertices on each side, and by $G(n; n; p)$ the probability
distribution on $G(n; n)$, where each of the $n^2$ potential edges is present with
probability $p$, independent of other edges. Similarly, we denote by $G(n)$ the set
of all undirected graphs with $n$ vertices and by $G(n; p)$ the probability
distribution on $G(n)$, where each of the $n(n-1)/2$ potential edges is present with
probability $p$, independent of other edges. The average degree of each vertex in
a graph drawn from $G(n; n; p)$ or $G(n; p)$ is $pn$ and $p(n-1)$, respectively. We
will use $c$ to denote the average degree of a random graph.
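
The two distributions can be sampled directly from their definitions. The sketch below is only an illustration: the function names `sample_bipartite` and `sample_general`, the adjacency-list output and the `seed` parameter are assumptions introduced here, and the quadratic-time sampling loop is chosen for clarity rather than efficiency.

```python
import random


def sample_bipartite(n, c, seed=None):
    """Draw from G(n; n; c/n): adj[u] lists right neighbours of left vertex u."""
    rng = random.Random(seed)
    p = c / n
    return [[v for v in range(n) if rng.random() < p] for _ in range(n)]


def sample_general(n, c, seed=None):
    """Draw from G(n; c/(n-1)): adj[u] lists all neighbours of vertex u."""
    rng = random.Random(seed)
    p = c / (n - 1)
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # each of the n(n-1)/2 pairs independently
                adj[u].append(v)
                adj[v].append(u)
    return adj


bip = sample_bipartite(200, 10, seed=1)
gen = sample_general(200, 10, seed=1)
avg_bip = sum(len(a) for a in bip) / len(bip)
avg_gen = sum(len(a) for a in gen) / len(gen)
print(avg_bip, avg_gen)  # both should be close to c = 10
```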

Our Results. We prove that in a random graph drawn from $G(n; n; c/n)$ or
from $G(n; c/(n-1))$, with high probability every non-maximum matching has
an augmenting path of length $O(\log n)$, if only $c$ is above a certain constant.
For bipartite graphs, our analysis requires that $c > 8.83$, for general graphs it
requires that $c > 32.67$. It follows that under these conditions, the running time
of the algorithms of Hopcroft and Karp on bipartite random graphs and Micali
and Vazirani on general random graphs is $O(m \log n)$ with high probability.

We conjecture the existence of short augmenting paths for every value of $c$.
Observe that for tiny values of $c$, for example $c < 1$, all paths are of length
$O(\log n)$ and hence also all augmenting paths must be short. It is conceivable
that our analysis can be strengthened so as to cover all values of $c$; we comment
further on this in our conclusions.

Related Work. Motwani [12] presented the first average case analysis for
matching algorithms. He showed that every non-maximum matching in a random graph
from $G(n; n; c/n)$ or from $G(n; c/(n-1))$ with $c > \ln n$ has a logarithmic length
augmenting path with high probability. The analysis rests on two key
observations: (i) expander graphs¹ admit short augmenting paths with respect to any
non-maximum matching, and (ii) random graphs with $c > \ln n$ are structurally
so similar to expander graphs that the short augmenting path property carries
over. Motwani's analysis breaks down when $c$ is significantly below $\ln n$. When $c$
is constant, for example, with high probability a constant fraction of the vertices
is isolated and a constant fraction of the vertices has degree one, and such graphs
are certainly not structurally similar to expanders.

Novelty. Nevertheless, on a high level our approach is similar to that of
Motwani. We grow alternating trees as they are constructed in augmenting path
algorithms at two free vertices connected by an augmenting path and show that
the trees meet with high probability after $\Theta(\log n)$ layers. Our main technical
lemma states that such trees exhibit exponential growth after $\Theta(\log n)$ layers;
we remark that they may stay skinny for up to $\Theta(\log n)$ layers. In the proof, we
exploit several structural properties of these trees, such as connectivity,
degree-one descendence due to the matching edges, etc. In contrast to this, Motwani
works with expansion for plain sets of vertices, which only holds for $c > \ln n$ and
gives rise to several complications in the analysis, which we can avoid here. Our
analysis is therefore at the same time stronger and simpler.



2 Main Result 

In this section we state our main result, Theorem 1, explain the central ideas of
its proof, and give an overview of the rest of the paper.

Theorem 1. There is a constant $c_0$ such that a random graph from $G(n; n; c/n)$
or from $G(n; c/(n-1))$, where $c \ge c_0$, with high probability has the property that
every non-maximum matching has an augmenting path of length $O(\log n)$. In a
graph with this property, a maximum matching can be computed in $O(m \log n)$
time, where $m$ is the number of edges.



Remark 1. For a random graph from $G(n; n; c/n)$, the theorem holds for $c > 8.83$.
For a random graph from $G(n; c/(n-1))$, it holds for $c > 32.67$.

¹ In an expander graph the cardinality of the set of neighbors of any vertex set $S$ with
$|S| \le n/2$ is at least $(1+\varepsilon)|S|$ for some positive constant $\varepsilon$.







H. Bast et al. 



A central notion in our analysis will be that of an augmenting path tree. 
Augmenting path trees arise in the standard breadth-first search for augmenting 
paths for a given non-maximum matching: start from a free vertex, add all its 
neighbours, if none of them is free (otherwise an augmenting path is found) add 
all the incident matching edges and their other endpoints, and so on. We first 
give the formal definition, then Figure 1 provides an example. 

Definition 1. For a rooted tree $T$, let Even($T$) denote the set of vertices at
even non-zero levels (i.e., excluding the root), and let Odd($T$) denote the set of
vertices at odd levels, where the root has level 0, its children have level 1, and so
on. The largest level of a vertex in $T$ is denoted by depth($T$).

An augmenting path tree is a rooted tree $T$ of even depth, where each vertex
of Odd($T$) has exactly one child; in particular, $|\mathrm{Odd}(T)| = |\mathrm{Even}(T)|$. An
augmenting path tree is for a particular matching, if its root is free with respect to
that matching, and all edges between an odd level and the next larger even level
are in the matching.
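
To make Definition 1 concrete, here is a hedged Python sketch that checks whether a rooted tree is an augmenting path tree, optionally for a given matching. The representation (a `children` dictionary mapping a vertex to the list of its children, and a symmetric `mate` dictionary for the matching) and both function names are assumptions of this example.

```python
def tree_levels(children, root):
    """Map every vertex of the rooted tree to its level (root = 0)."""
    level = {root: 0}
    stack = [root]
    while stack:
        u = stack.pop()
        for w in children.get(u, ()):
            level[w] = level[u] + 1
            stack.append(w)
    return level


def is_augmenting_path_tree(children, root, mate=None):
    """Check Definition 1; if `mate` is given, also check it is for that matching."""
    level = tree_levels(children, root)
    odd = [v for v, d in level.items() if d % 2 == 1]
    # Every odd-level vertex needs exactly one child. This already forces
    # all leaves onto even levels, hence an even depth.
    if any(len(children.get(v, ())) != 1 for v in odd):
        return False
    if mate is None:
        return True
    # The root must be free, and each odd vertex is matched to its child.
    return root not in mate and all(mate.get(v) == children[v][0] for v in odd)


children = {"r": ["a", "b"], "a": ["x"], "b": ["y"]}
mate = {"a": "x", "x": "a", "b": "y", "y": "b"}
print(is_augmenting_path_tree(children, "r", mate))
```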



Fig. 1. Left: An augmenting path tree $T$ with $|\mathrm{Even}(T)| = |\mathrm{Odd}(T)| = 8$. Right: The
tree with vertices on odd levels “removed”, as used in the proof of Lemma 2.




Our approach to proving Theorem 1 is as follows. Given a non-maximum
matching, we pick the two free vertices of an augmenting path, and from these
two vertices we grow two augmenting path trees $T_1$ and $T_2$, one from each. The
following lemma names a set of properties, which are sufficient for the existence
of a short augmenting path.

Lemma 1. Let $T_1$ and $T_2$ be two augmenting path trees for a given
non-maximum matching in a given graph. Then the following properties imply that
there is an augmenting path of length at most $\mathrm{depth}(T_1) + \mathrm{depth}(T_2) + 1$:

(a) $T_1$ and $T_2$ are (vertex and edge) disjoint;

(b) One of the following holds:

(b1) either there is a free vertex adjacent to $\mathrm{Even}(T_1)$ or to $\mathrm{Even}(T_2)$,

(b2) or there is an edge between $\mathrm{Even}(T_1)$ and $\mathrm{Even}(T_2)$.

Proof. If property (b1) holds, there is an augmenting path via just one of the
trees, of length at most $\max\{\mathrm{depth}(T_1), \mathrm{depth}(T_2)\} + 1$. If property (b2) holds,
then owing to (a) there is an augmenting path from the root of $T_1$ to the root
of $T_2$ of length at most $\mathrm{depth}(T_1) + \mathrm{depth}(T_2) + 1$.

□

Our construction of the trees $T_1$ and $T_2$ with these properties will be
incremental, terminating as soon as property (b1) or (b2) is fulfilled. In Section 3,
we will give the construction for bipartite random graphs. In Section 4, we deal
with general random graphs.

The main difficulty will be to prove that the construction terminates with
at most logarithmic depth for both trees. The key will be the following lemma,
which establishes an expansion property for augmenting path trees, when the
average degree is above a certain constant.

While the lemma is formulated and proven completely independently from its
later use, some readers might prefer to first study the construction from Section
3 in more detail, see how the lemma is used there, and then come back to this
section. In the lemma below, as well as in our constructions, we will use $\Gamma_G(X)$ to
denote the neighbourhood of a vertex set $X$ in $G$, i.e., the set of vertices adjacent
to $X$ in $G$.

Lemma 2. For each $\varepsilon > 0$ and $\beta > 1 + \varepsilon$, there exist constants $\alpha$ and $c_0$ such
that a random graph $G$ from $G(n; n; c/n)$ or from $G(n; c/(n-1))$, where $c \ge c_0$,
with high probability has the following property: for each augmenting path tree $T$
with $\alpha \log n \le |\mathrm{Even}(T)| \le n/\beta$, it holds that $|\Gamma_G(\mathrm{Even}(T))| \ge (1+\varepsilon) \cdot |\mathrm{Even}(T)|$.



Remark 2. For a random graph from $G(n; n; c/n)$, for $\varepsilon = 0.001$ and $\beta = 2.57$,
the lemma holds with $c_0 = 8.83$. For a random graph from $G(n; c/(n-1))$, for
$\varepsilon = 2.01$ and $\beta = 6.03$, the lemma holds with $c_0 = 32.67$. These will be the
settings when we apply the lemma in Sections 3 and 4. The derivation of these
constants is explained at the end of Section 3.

Proof. If a graph $G$ does not have the property from the lemma, the following
bottleneck² constellation occurs in $G$:

(i) an augmenting path tree $T$ with $\alpha \log n \le |\mathrm{Even}(T)| \le n/\beta$;

(ii) a set $\Gamma \supseteq \mathrm{Odd}(T)$ with $|\Gamma| < (1+\varepsilon) \cdot |\mathrm{Even}(T)|$;

(iii) for each vertex from $\Gamma \setminus \mathrm{Odd}(T)$, an edge to a vertex from $\mathrm{Even}(T)$ (the
edges from $\mathrm{Odd}(T)$ to $\mathrm{Even}(T)$ are already taken care of in (i));

(iv) no edge between $\mathrm{Even}(T)$ and $V \setminus \Gamma$, where $V$ is the set of all vertices of $G$.

We will show that the probability that any such bottleneck constellation
occurs is polynomially small in $n$. We first give the proof for a bipartite random
graph, and then describe the (few) changes required for a general random graph.

If a fixed bottleneck constellation occurs in a graph from $G(n; n)$, the
following events occur, where we write $l = |\mathrm{Even}(T)|$ and $r = |\Gamma|$: (i) the $2l$ edges
from $T$ are present, (ii) the $r - l$ edges from $\Gamma \setminus \mathrm{Odd}(T)$ to $\mathrm{Even}(T)$ are present,
and (iii) none of the $l(n-r)$ edges between $\mathrm{Even}(T)$ and $V' \setminus \Gamma$ are present,
where $V'$ is the side of the bipartite graph containing $\Gamma$ and we exploit that in
a bipartite graph $\mathrm{Even}(T)$ and $\Gamma$ lie on opposite sides of the graph. It follows
that the probability that each of these, obviously independent, events occurs in
a random graph from $G(n; n; c/n)$ is at most

$$(c/n)^{l+r} \cdot (1 - c/n)^{l(n-r)},$$

which, using that $l \le n/\beta$, $l \le r \le (1+\varepsilon) \cdot l$, and $1 - c/n \le e^{-c/n}$, is bounded
by

$$n^{-(l+r)} \cdot c^{(2+\varepsilon) \cdot l} \cdot e^{-c(1-(1+\varepsilon)/\beta) \cdot l} = n^{-(l+r)} \cdot \left( \frac{c^{2+\varepsilon}}{e^{c(1-(1+\varepsilon)/\beta)}} \right)^{l}.$$

² In his work, Motwani uses this name in a related context.

The number of potential bottleneck constellations, i.e., the number of
different bottleneck constellations in the complete bipartite graph on $2n$ vertices,
with $|\mathrm{Even}(T)| = l$ and $|\Gamma| = r$ is (i) the number of augmenting path trees $T$
with $|\mathrm{Even}(T)| = l$, times (ii) the number of ways to choose the $r - l$ vertices for
$\Gamma \setminus \mathrm{Odd}(T)$ from $V' \setminus \mathrm{Odd}(T)$, where $V'$ are the vertices on that side of the
bipartite graph containing $\mathrm{Odd}(T)$ (vertices on the other side of the graph cannot be
in the neighbourhood of $\mathrm{Even}(T)$), times (iii) the number of ways to choose for
each of these $r - l$ vertices an edge to one of the $l$ vertices from $\mathrm{Even}(T)$.

Clearly, the number for (iii) is $l^{r-l}$ and the number for (ii) is $\binom{n-l}{r-l} \le \binom{n}{r-l}$.
To count the number of augmenting path trees $T$ with $|\mathrm{Even}(T)| = l$, observe
that via “removing” the vertices in $\mathrm{Odd}(T)$, as illustrated by an example in
Figure 1, each such tree corresponds to a unique combination of a tree on $l + 1$
vertices, and a sequence of $l$ distinct vertices. By Cayley's theorem [1] the number
of trees on $l + 1$ vertices is $(l+1)^{l-1}$, and the number of sequences of $l$ distinct
vertices from one side of a graph from $G(n; n)$ is $n \cdot (n-1) \cdots (n-l+1) \le n^{l}$.

The total number of potential bottleneck constellations in a $G(n; n)$ graph is
hence at most

$$\binom{n}{l+1} \cdot (l+1)^{l-1} \cdot n^{l} \cdot \binom{n}{r-l} \cdot l^{r-l} \le n^{r+l+1} \cdot e^{r+1} \cdot \left( \frac{l}{r-l} \right)^{r-l} \le n^{r+l+1} \cdot e^{r+1} \cdot \left( \varepsilon^{-\varepsilon} \right)^{l},$$

where the last inequality holds³ for $r \le (1+\varepsilon) \cdot l$ and $\varepsilon \le 1/e$.

Combining the bounds, we conclude that a random graph from $G(n; n; c/n)$
contains any bottleneck constellation with $|\mathrm{Even}(T)| = l$ and $|\Gamma| = r$, with
probability at most

$$e n \cdot \left( \frac{e^{1+\varepsilon} \cdot \varepsilon^{-\varepsilon} \cdot c^{2+\varepsilon}}{e^{c(1-(1+\varepsilon)/\beta)}} \right)^{l} = e n \cdot q^{l},$$

where $q$ is just an abbreviation for the fractional term. For sufficiently large $c$,
we have $q < 1$; in particular, this holds for the values stated in the remark to
the lemma: $\varepsilon = 0.001$, $\beta = 2.57$, and $c > 8.83$.

We finally sum over all $r$, $l$ with $\alpha \cdot \log n \le l \le n/\beta$ and $l \le r \le (1+\varepsilon) \cdot l$,
and get a total probability of at most

$$e n^{3} \cdot q^{\alpha \log n} = e n^{3 - \alpha \log(1/q)},$$

which for sufficiently large $\alpha$ is polynomially small in $n$. This finishes the proof
of Lemma 2 for bipartite random graphs.

³ Let $r = (1+\kappa) \cdot l$ with $0 \le \kappa \le \varepsilon$. If $\kappa = 0$, the claim is obvious (recall $0^0 = 1$).
If $\kappa > 0$, we have $(l/(r-l))^{r-l} = (1/\kappa)^{\kappa \cdot l} \le (1/\varepsilon)^{\varepsilon \cdot l}$, since $(1/\kappa)^{\kappa}$ is increasing for
$\kappa \le 1/e$.
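
The condition $q < 1$ for the bipartite case can be checked numerically. The sketch below assumes that $q$ is the product of the two bounds derived in the proof, i.e. $q = e^{1+\varepsilon} \varepsilon^{-\varepsilon} c^{2+\varepsilon} / e^{c(1-(1+\varepsilon)/\beta)}$; this form is our reading of the proof rather than a formula quoted from it, and the function name `bottleneck_q` is hypothetical.

```python
import math


def bottleneck_q(eps, beta, c):
    """Assumed form of q for bipartite graphs (see the lead-in above)."""
    log_q = (
        (1 + eps)                       # from e^{r+1} <= e * e^{(1+eps) l}
        + eps * math.log(1 / eps)       # from (eps^{-eps})^l
        + (2 + eps) * math.log(c)       # from c^{(2+eps) l}
        - c * (1 - (1 + eps) / beta)    # from the missing edges
    )
    return math.exp(log_q)


q_paper = bottleneck_q(0.001, 2.57, 8.83)
q_small_c = bottleneck_q(0.001, 2.57, 8.0)
print(q_paper < 1, q_small_c < 1)
```

Under this assumption the stated constants give a value just below 1, while a somewhat smaller $c$ with the same $\beta$ does not, which matches the remark that the lemmas only "just" hold.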

In a random graph from $G(n; c/(n-1))$, the bound on the probability that
a fixed constellation with $|\mathrm{Even}(T)| = l$ and $|\Gamma| = r$ occurs, is

(c/(n - nr' . (1 - < („ - 1 )-<‘+'> . . 

The number of bottleneck constellations can be bounded just like before by 

/7 I 1\^~1 ( \ jr — l I ^ r+i + 1 r+1 f ^ \ ^ + I r+1 

•0 + 1) '(r-l)' \~l) 

where the last inequality now holds for $r \le (1+\varepsilon) \cdot l$, but without restriction on $\varepsilon$
(for arbitrary random graphs, we will apply the lemma with $\varepsilon > 2$). Altogether,
a random graph from $G(n; c/(n-1))$ then contains any bottleneck constellation
with probability at most

H e^n- 

cn log n<l<n l<r<n 

where the additional factor comes from (n — !)“(*+’’) • < (1 + ^/{n — 

2 ^))r+; ^ is just a number > 1+ln 1.45, and q is again an abbreviation 

for the fractional term. For sufficiently large $c$, and in particular for $\varepsilon = 2.01$,
$\beta = 6.03$ and $c > 32.67$, we have $q < 1$, and then for large enough $\alpha$ the
probability is negligible. This proves Lemma 2 also for arbitrary random graphs.

□ 

The following simple property of random graphs was already stated and
proven in [12, Lemma 3(d)], except that Motwani did not make the threshold
on $c$ explicit. We remark that this threshold is one of the major bottlenecks for
reducing the threshold on $c$ in our main result, Theorem 1.

Lemma 3. For every $\beta > 1$, and for $c > 2 \cdot \beta^2 \cdot H(1/\beta) \cdot \ln 2$, where $H(x) =
-x \log_2 x - (1-x) \log_2(1-x)$ is the binary entropy function, a random graph
from $G(n; n; c/n)$ or from $G(n; c/(n-1))$ with high probability has the property
that every two disjoint sets of vertices, both of size at least $n/\beta$, have an edge
between them.



gl.38+e^2+£' \ 
gc(l-(l+£)//3) ) 



< en^ ■ = g^3-alog(l/9)^ 




Proof. The probability that no edge runs between two disjoint sets of sizes $l$ and
$r$ is exactly $(1 - c/n)^{l \cdot r}$. If two disjoint subsets of size at least $n/\beta$ and with no
edge between them exist, then there exist also two subsets of size exactly $\lceil n/\beta \rceil$
with no edge between them (just remove the necessary number of vertices from
each set), and this happens with probability at most

$$\binom{n}{\lceil n/\beta \rceil}^{2} \cdot (1 - c/n)^{\lceil n/\beta \rceil^{2}}.$$

We have $\binom{n}{k} \le 2^{H(k/n) \cdot n}$, where $H$ is the binary entropy function as stated in
the lemma. Since $H(x)/x$ is monotonically decreasing on $(0, 1)$, we have

$$\binom{n}{\lceil n/\beta \rceil} \le 2^{\beta \cdot H(1/\beta) \cdot \lceil n/\beta \rceil}.$$

The quantity $1 - c/n$ we bound by $e^{-c/n}$. This gives us the following
bound on the above probability:

$$\left( 2^{\beta \cdot H(1/\beta)} \cdot e^{-c/(2\beta)} \right)^{2 \lceil n/\beta \rceil}.$$

This is a negligible probability, provided that $c > 2 \cdot \beta^2 \cdot H(1/\beta) \cdot \ln 2$. We remark
that, had we estimated the binomial coefficient via the standard $\binom{n}{k} \le (en/k)^{k}$,
we would have obtained the slightly more restrictive condition $c > 2\beta(1 + \ln \beta)$.

□
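
The threshold of Lemma 3 is easy to evaluate. The following sketch computes $c(\beta) = 2 \beta^2 H(1/\beta) \ln 2$ for the two values of $\beta$ used in Remark 2 and compares it with the cruder bound $2\beta(1 + \ln\beta)$ mentioned at the end of the proof; the function names are assumptions of this example.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x) for 0 < x < 1."""
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def lemma3_threshold(beta):
    """Lower bound on c required by Lemma 3."""
    return 2 * beta**2 * binary_entropy(1 / beta) * math.log(2)


def crude_threshold(beta):
    """Bound obtained from the standard estimate (en/k)^k instead."""
    return 2 * beta * (1 + math.log(beta))


for beta in (2.57, 6.03):
    print(beta, round(lemma3_threshold(beta), 2), round(crude_threshold(beta), 2))
```

For $\beta = 2.57$ and $\beta = 6.03$ the entropy-based threshold evaluates to roughly 8.83 and 32.67, the constants quoted in Remark 1.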



3 Constructing the Trees for Bipartite Random Graphs 

For a given non-maximum matching of a graph $G$ from $G(n; n)$, consider an
augmenting path and pick its two free endpoints, $f_1$ and $f_2$. Note that since every
augmenting path has odd length, in a bipartite graph these two free vertices lie
on opposite sides of $G$. The following procedure constructs $T_1$ and $T_2$.

0. Initially let $T_1$ and $T_2$ be the trees with $f_1$ and $f_2$ as the only vertex and
root, respectively. Each of the following iterations will add two more levels
to $T_1$ and to $T_2$.

1. Let $\Gamma(T) = \Gamma_G(\mathrm{Even}(T)) \setminus \mathrm{Odd}(T)$, for $T = T_1, T_2$.

2. If $\Gamma(T_1)$ or $\Gamma(T_2)$ contains a free vertex, STOP.

3. If $\Gamma(T_1)$ contains a vertex which is already in $\mathrm{Even}(T_2)$, or vice versa, STOP.

4. If there is a matching edge between $\Gamma(T_1)$ and $\Gamma(T_2)$, add it, together with
the endpoint and edge connecting it to (say) $T_1$, then STOP.

5. Otherwise add to $T$ all the vertices from $\Gamma(T)$, together with the edges
connecting them to $\mathrm{Even}(T)$ (by construction all vertices from $\Gamma(T)$ are in fact
adjacent to the largest level of $T$), for $T = T_1, T_2$.

6. Add the matching edges incident to $\Gamma(T)$ together with their other endpoints
to $T$, for $T = T_1, T_2$.

7. Repeat 1.-6.
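
The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the graph is a dictionary `adj` of neighbour sets, the matching is a symmetric dictionary `mate`, and the function `grow_trees` only tracks the vertex sets of the even and odd levels instead of building explicit trees. It also assumes that the root is expanded together with the even levels, so that the first iteration has something to grow from, and it reports "stuck" when a neighbourhood becomes empty instead of looping forever.

```python
def grow_trees(adj, mate, f1, f2, max_iters=1000):
    """Grow two augmenting path trees from the free vertices f1 and f2.

    adj: dict vertex -> set of neighbours (bipartite graph).
    mate: dict vertex -> matched partner, both directions; free vertices absent.
    Returns (reason, iterations), reason in {"b1", "b2", "stuck", "limit"}.
    """
    # Even-level vertices of each tree; the root is included (assumption).
    even = [{f1}, {f2}]
    odd = [set(), set()]
    for iteration in range(1, max_iters + 1):
        # Step 1: neighbours of the even levels that are not yet odd vertices.
        gamma = []
        for t in (0, 1):
            neighbours = set().union(*(adj[v] for v in even[t]))
            gamma.append(neighbours - odd[t])
        # Step 2: a free vertex was reached, property (b1).
        if any(v not in mate for g in gamma for v in g):
            return "b1", iteration
        # Step 3: an even vertex of one tree is adjacent to the other tree.
        if gamma[0] & even[1] or gamma[1] & even[0]:
            return "b2", iteration
        # Step 4: a matching edge joins the two new odd levels.
        if any(mate[v] in gamma[1] for v in gamma[0]):
            return "b2", iteration
        if not gamma[0] or not gamma[1]:
            return "stuck", iteration
        # Steps 5 and 6: add the new odd level and the matched even level.
        for t in (0, 1):
            odd[t] |= gamma[t]
            even[t] |= {mate[v] for v in gamma[t]}
    return "limit", max_iters


# Augmenting path f1 - a = b - c = d - f2 ("=" marks matching edges).
adj = {"f1": {"a"}, "a": {"f1", "b"}, "b": {"a", "c"},
       "c": {"b", "d"}, "d": {"c", "f2"}, "f2": {"d"}}
mate = {"a": "b", "b": "a", "c": "d", "d": "c"}
result = grow_trees(adj, mate, "f1", "f2")
print(result)
```

On this six-vertex path the two trees meet in the second iteration, when the new level of the first tree hits an even vertex of the second one (step 3).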

We first show that this construction fulfills the properties of Lemma 1. When
the procedure stops in step 2, we have property (b1). When it stops in step 3
or 4, we have an edge between $\mathrm{Even}(T_1)$ and $\mathrm{Even}(T_2)$, which is property (b2).
Since the roots of $T_1$ and $T_2$ lie on opposite sides of the bipartite graph $G$, we
have $\mathrm{Even}(T_1) \cap \mathrm{Even}(T_2) = \mathrm{Odd}(T_1) \cap \mathrm{Odd}(T_2) = \emptyset$. Steps 3 and 4 ensure
that $\mathrm{Odd}(T_1) \cap \mathrm{Even}(T_2) = \mathrm{Odd}(T_2) \cap \mathrm{Even}(T_1) = \emptyset$, hence we have complete
disjointness of $T_1$ and $T_2$, which is property (a).

It remains to show that the procedure terminates within $O(\log n)$ iterations
(note that by what we have shown so far, the procedure could run forever, namely
when at some point $\Gamma(T) = \emptyset$ in step 1). Since each iteration adds two levels
to each tree, the depth of the trees would then be $O(\log n)$, which by Lemma 1
would prove Theorem 1.

By construction, in step 6 of every iteration at least the matching edge of
the augmenting path starting in $f_1$ is added to $T_1$, and the same holds for $f_2$
and $T_2$. After $\alpha \log n$ iterations therefore, $|\mathrm{Even}(T)| \ge \alpha \cdot \log n$. Consider an
iteration $i$, for $i > \alpha \cdot \log n$, which passes steps 2-4. Let $T$ denote one of the trees
(the following argument holds for $T_1$ as well as for $T_2$) at the beginning of the
iteration, and let $T'$ denote the tree at the end of the iteration, with two new
levels added to it. We apply Lemma 2 with $\varepsilon = 0.001$ and $\beta = 2.57$; the value
for $\varepsilon$ is just a small one satisfying the requirement $\varepsilon > 0$ of Lemma 2, the choice
for $\beta$ will be explained in the next but one paragraph. When $|\mathrm{Even}(T)| < n/\beta$,
Lemma 2 gives that $|\Gamma_G(\mathrm{Even}(T))| \ge (1+\varepsilon) \cdot |\mathrm{Even}(T)|$. Since $|\mathrm{Even}(T')| =
|\mathrm{Even}(T)| + |\Gamma(T)| = |\mathrm{Even}(T)| + |\Gamma_G(\mathrm{Even}(T)) \setminus \mathrm{Odd}(T)| = |\Gamma_G(\mathrm{Even}(T))|$, we
have $|\mathrm{Even}(T')| \ge (1+\varepsilon) \cdot |\mathrm{Even}(T)|$. This proves that when the procedure runs
for $\alpha \log n + \log_{1+\varepsilon}(n/\beta) = O(\log n)$ iterations, then certainly $|\mathrm{Even}(T)| \ge n/\beta$,
for $T = T_1, T_2$.
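
To get a feeling for the constants hidden in this $O(\log n)$ bound, the expression $\alpha \log n + \log_{1+\varepsilon}(n/\beta)$ can be evaluated directly. The sketch below is only illustrative: the value of $\alpha$ is not specified in the text, so it is a free (hypothetical) parameter here, and natural logarithms are assumed for $\log n$.

```python
import math


def iteration_bound(n, alpha, eps=0.001, beta=2.57):
    """alpha * log n + log_{1+eps}(n / beta), with natural logarithms assumed."""
    warm_up = alpha * math.log(n)                      # iterations until Lemma 2 applies
    growth = math.log(n / beta) / math.log(1 + eps)    # iterations of (1+eps)-growth
    return warm_up + growth


bound = iteration_bound(10**6, alpha=1.0)
print(round(bound), round(bound / math.log(10**6)))
```

With $\varepsilon = 0.001$ the growth term dominates: the bound is still logarithmic in $n$, but with a constant factor of roughly $1/\varepsilon$.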

Consider the first iteration, where both $|\mathrm{Even}(T_1)|$ and $|\mathrm{Even}(T_2)|$ are at least
$n/\beta$. By property (a), already established above, the two sets are disjoint, hence
by Lemma 3, with high probability there is an edge between them. With such
an edge, the procedure stops in step 3. This proves that with high probability
the procedure terminates within $O(\log n)$ iterations, and hence with two trees
of depth $O(\log n)$. This finishes the proof of Theorem 1 for random bipartite
graphs.

We finally comment on our choice of $\beta = 2.57$ above, and how it leads to
the requirement $c > 8.83$ in Theorem 1. Both Lemma 2 and Lemma 3 put a
lower bound on $c$. For Lemma 2, this bound comes from the quantity $q$, defined
in the proof of that lemma, which has to be strictly less than 1; this quantity
depends on both $\beta$ and $c$, hence let us write $q(\beta, c)$. Lemma 3 gives an explicit
lower bound on $c$, depending only on $\beta$; let us write $c(\beta)$ for this bound. We are
looking for the smallest $\beta$ where $q(\beta, c(\beta)) < 1$, which, in turn, will give us the
smallest $c$ for which our argument goes through. Using Gnuplot [9], we find that
we can choose $\beta$ as small as 2.57; then for $c > 8.83$, both lemmas (just) hold.
For the analysis of the construction for arbitrary random graphs, given in the
next section, the values are found in the same manner, though with a different
$q$ (see the proof of Lemma 2), and with $\varepsilon = 2.01$, because the construction there
requires that $\varepsilon > 2$.

4 Constructing the Trees for Arbitrary Random Graphs 

For a given non-maximum matching of a graph $G$ from $G(n)$, consider an
augmenting path and pick its two free endpoints, $f_1$ and $f_2$. The procedure for
constructing $T_1$ and $T_2$ is similar to the one for bipartite graphs but with three
complications: (i) two vertices from the neighborhood of $\mathrm{Even}(T_1)$ or of $\mathrm{Even}(T_2)$ may
be incident to the same matching edge, so that we can add only one of them to
the tree (step 5 below), (ii) the disjointness of the neighborhoods of $\mathrm{Even}(T_1)$
and $\mathrm{Even}(T_2)$ has to be taken care of explicitly now (step 6 below), and (iii)
because only part of the neighborhood of $\mathrm{Even}(T)$ is eventually added to $T$, for
$T = T_1, T_2$, starting from the free vertices alone it could now indeed happen that
$\Gamma(T_1) = \emptyset$ or $\Gamma(T_2) = \emptyset$ in one of the first $\alpha \log n$ iterations; therefore in step
0 we now start with a piece of size $2 \lceil \alpha \log n \rceil$ of the augmenting path for each
tree.

0. Let $T_1$ be the prefix of length $2 \lceil \alpha \log n \rceil$ of the augmenting path starting at
$f_1$, and let $T_2$ be the suffix of length $2 \lceil \alpha \log n \rceil$. If the two are not disjoint,
i.e., the length of the augmenting path is $4 \alpha \log n$ or less, remove $T_1 \cap T_2$
from one of the trees and STOP (the properties of Lemma 1 are then
fulfilled). Otherwise, $|\mathrm{Even}(T_1)|, |\mathrm{Even}(T_2)| \ge \alpha \log n$, and each of the following
iterations will add two more levels to both $T_1$ and $T_2$.

1. Let $\Gamma(T_1) = \Gamma_G(\mathrm{Even}(T_1)) \setminus (T_1 \cup \mathrm{Odd}(T_2))$, and let $\Gamma(T_2) =
\Gamma_G(\mathrm{Even}(T_2)) \setminus (T_2 \cup \mathrm{Odd}(T_1))$.

2. If $\Gamma(T_1)$ or $\Gamma(T_2)$ contains a free vertex, STOP.

3. If $\Gamma(T_1)$ contains a vertex which is already contained in $\mathrm{Even}(T_2)$, or vice
versa, STOP.

4. If there is a matching edge between $\Gamma(T_1)$ and $\Gamma(T_2)$, add it, together with
the endpoint and edge connecting it to (say) $T_1$, then STOP.

5. Let $\Gamma'(T)$ be a maximal subset of $\Gamma(T)$ in which no two vertices match each
other, for $T = T_1, T_2$; then $|\Gamma'(T)| \ge \lceil |\Gamma(T)|/2 \rceil$.

6. Let $\Gamma''(T_1) \subseteq \Gamma'(T_1)$ and $\Gamma''(T_2) \subseteq \Gamma'(T_2)$ such that $|\Gamma''(T_1)| = |\Gamma''(T_2)| \ge
\lfloor \min\{|\Gamma'(T_1)|, |\Gamma'(T_2)|\}/2 \rfloor$ and $\Gamma''(T_1) \cap \Gamma''(T_2) = \emptyset$; this takes from $\Gamma'(T_1)$
and $\Gamma'(T_2)$ two maximally large subsets that are disjoint and of equal size
(the worst case is when $\Gamma'(T_1)$ and $\Gamma'(T_2)$ are equal and of odd size).

7. Add to $T$ the vertices from $\Gamma''(T)$, together with the edges connecting them
to $\mathrm{Even}(T)$ (not necessarily only to vertices at the largest level of $T$, as in
the bipartite case), for $T = T_1, T_2$.

8. Add the matching edges incident to $\Gamma''(T)$ together with their other
endpoints, to $T$, for $T = T_1, T_2$.

9. Repeat 1.-8.

Like in the bipartite case, it is easy to see that the properties of Lemma 1
are fulfilled. After step 0, $T_1$ and $T_2$ are disjoint, and by steps 3, 4, 5, and 6,
disjoint sets of vertices are added to $T_1$ and $T_2$ in steps 7 and 8, which yields
property (a). When the procedure stops in step 2, we have property (b1), if it
stops in step 3 or 4, we have property (b2).

Let $T$ denote one of the trees at the beginning of a fixed iteration, assuming
that it has passed steps 2-4. Assume that $|\mathrm{Even}(T)| < n/\beta$. Then by Lemma 2,
applied with $\varepsilon = 2.01$ and $\beta = 6.19$, $|\Gamma_G(\mathrm{Even}(T))| \ge (3 + 9\varepsilon')|\mathrm{Even}(T)|$, where
$\varepsilon' = 0.001$. Steps 0 and 6 ensure that at the beginning and end of every iteration,
$|\mathrm{Even}(T_1)| = |\mathrm{Even}(T_2)|$, so that $|\mathrm{Even}(T_1)|$, $|\mathrm{Odd}(T_1)|$, $|\mathrm{Even}(T_2)|$, $|\mathrm{Odd}(T_2)|$ are
all equal, and thus $|T_1 \cup \mathrm{Odd}(T_2)| = |T_2 \cup \mathrm{Odd}(T_1)| = 3|\mathrm{Even}(T)| + 1$. Hence
after step 1, $|\Gamma(T)| \ge |\Gamma_G(\mathrm{Even}(T))| - (3|\mathrm{Even}(T)| + 1) \ge 9\varepsilon'|\mathrm{Even}(T)| - 1 \ge
8\varepsilon'|\mathrm{Even}(T)|$, where we assume without loss of generality that $\alpha \ge 1/\varepsilon'$ and hence
$\varepsilon'|\mathrm{Even}(T)| \ge \varepsilon' \alpha \log n \ge 1$. Then after step 5, $|\Gamma'(T)| \ge 4\varepsilon'|\mathrm{Even}(T)|$, and since
this holds for $T = T_1$ and for $T = T_2$, after step 6, $|\Gamma''(T)| \ge \lfloor 4\varepsilon'|\mathrm{Even}(T)|/2 \rfloor \ge
2\varepsilon'|\mathrm{Even}(T)| - 1 \ge \varepsilon'|\mathrm{Even}(T)|$. In step 8, one vertex per vertex in $\Gamma''(T)$ is
added, so that, if $T'$ denotes the tree at the end of the iteration, we have

$$|\mathrm{Even}(T')| = |\mathrm{Even}(T)| + |\Gamma''(T)| \ge |\mathrm{Even}(T)| + \varepsilon'|\mathrm{Even}(T)| = (1 + \varepsilon')|\mathrm{Even}(T)|.$$

This proves that within $O(\log n)$ iterations, either the procedure terminates or
at some point $|\mathrm{Even}(T_1)| = |\mathrm{Even}(T_2)| \ge n/\beta$.

As for the bipartite case, once $\mathrm{Even}(T_1)$ and $\mathrm{Even}(T_2)$ contain $n/\beta$ or more
vertices each, by Lemma 3 there will be an edge between the two sets, and
the procedure will stop in step 3. This proves that with high probability the
procedure terminates within $O(\log n)$ iterations, so that upon termination both
trees have depth $O(\log n)$. This finishes the proof of Theorem 1 for arbitrary
random graphs.

5 Conclusion 

We proved that in a random graph on $n$ vertices with high probability every
non-maximum matching has an augmenting path of length $O(\log n)$. Motwani
could prove this when the average degree is at least $\ln n$, whereas we only require
that $c$ is above a certain constant. Our expansion lemma is more powerful than
Motwani's and at the same time makes the whole analysis simpler; in fact, the
present writeup contains all proofs with all details.

While the expansion property on which the analysis in [12] is built does
not hold when $c$ is significantly smaller than $\ln n$, our condition on $c$ does not
appear to reflect a principal limit of our analysis. More refined versions of Lemma 
2 (an idea would be to consider augmenting path trees which have expansion 
not in every level but only over a certain constant number of levels, we have not 
pursued this further yet), and of Lemma 3 (so far, we have not exploited the 
special structure of the two large sets between which we need an edge), might 
well be able to do without any condition on c.

Abstract. We use tools from the algebraic theory of automata to investigate
the class of languages recognized by two models of Quantum Finite
Automata (QFA): Brodsky and Pippenger's end-decisive model, and a
new QFA model whose definition is motivated by implementations of
quantum computers using nuclear magnetic resonance (NMR). In particular,
we are interested in the new model since nuclear magnetic resonance
was used to construct the most powerful physical quantum machine to
date. We give a complete characterization of the languages recognized by
the new model and by Boolean combinations of the Brodsky-Pippenger
model. Our results show a striking similarity in the class of languages
recognized by the end-decisive QFAs and the new model, even though
these machines are very different on the surface.



1 Introduction 

In the classical theory of finite automata, it is unanimously recognized that the 
algebraic point of view is an essential ingredient in understanding and classify- 
ing computations that can be realized by finite state machines, i.e. the regular 
languages. It is well known that to each regular language $L$ can be associated a
canonical finite monoid (its syntactic monoid, $M(L)$) and unsurprisingly the
algebraic structure of $M(L)$ strongly characterizes the combinatorial properties of
$L$. The theory of pseudo-varieties of Eilenberg (which in this paper will be called
M-varieties for short) provides an elegant abstract framework in which these 
correspondences between monoids and languages can be uniformly discussed. 

^ Research supported by Grant No. 01.0354 from the Latvian Council of Science; Eu- 
ropean Commission, contract IST-1999-11234. Also for the fourth and fifth authors, 
the University of Latvia, Kristaps Morbergs fellowship. 

* Research supported by NSERC and FCAR. 



V. Diekert and M. Habib (Eds.): STAGS 2004, LNCS 2996, pp. 93—104, 2004. 
© Springer- Verlag Berlin Heidelberg 2004 







Finite automata are a natural model for classical computing with finite mem- 
ory, and likewise quantum finite automata (QFA) are a natural model for quan- 
tum computers that use a finite dimensional state space as memory. Quantum 
computing’s more general model of quantum circuits [19] gives us an upper bound 
on the capability of quantum machines, but the fact that several years have 
passed without the construction of such circuits (despite the efforts of many sci- 
entists) suggests that the first quantum machines are not going to be this strong. 
Thus it is not only interesting but practical to study simpler models alongside
the more general quantum circuit model.

There are several models of QFA [17,15,8,5,9,7] which differ in what quantum 
measurements are allowed. The most general model (independently [9] and [7]) 
allows any sequence of unitary transformations and measurements. The class 
of languages recognized by this model is all regular languages. In contrast, the 
model of [17] allows unitary transformations but only one measurement at the 
end of computation. The power of QFAs is then equal to that of permutation 
automata [17,8] (i.e. they recognize exactly group languages). In intermediate 
models [15,8,5], more than one measurement is allowed but the form of those 
measurements is restricted. The power of those models is between [17] and [9,7] 
but has not been characterized exactly, despite considerable effort [4,2] . The most 
general model of QFAs describes what is achievable in principle according to laws 
of quantum mechanics while some of the more restricted models correspond to 
what is actually achieved by current implementations of quantum computers. 

In view of the enduring success of the algebraic approach to analyze classical 
finite state devices, it is natural to ask if the framework can be used in the 
quantum context as well. The work that we present here answers the question 
in the affirmative. We will analyze two models of QFA: the model of [8] and a
new model whose definition is motivated by the properties of nuclear magnetic
resonance (NMR) quantum computing. Among various physical systems used to
implement quantum computing, liquid state NMR has been the most successful
so far, realizing quantum computers with up to 7 quantum bits [26]. Liquid
state NMR imposes restrictions on what measurements can be performed, and
the definition of the new model reflects this. In both cases we are able to provide
an algebraic characterization for the languages that these models can recognize.
It turns out that the classes of languages recognized by these two models coincide
almost exactly (that is, up to Boolean combinations), which is quite surprising
considering the differences between the two models (for example, the NMR model 
allows mixed states while the [8] model does not). It is a pleasant fact that 
the M-variety that turns up in analyzing these QFA is a natural one that has 
been extensively studied by algebraists. Besides using algebra, our arguments 
are also based on providing new constructions to enlarge the class of languages 
previously known to be recognizable in these models, as well as proving new 
impossibility results using subspace techniques (as developed in [4]), information 
theory (as developed in [18]), and quantum Markov chains (as developed in [3]). 
In particular, we show that the Brodsky-Pippenger model cannot recognize the
language $a\Sigma^*$ (it is already known [15] that $\Sigma^* a$ is not recognizable), and that
our new quantum model cannot recognize $a\Sigma^*$ or $\Sigma^* a$.

The paper is organized as follows. In Section 2 we give an introduction to the 
algebraic theory of automata and we define the models. In the next two sections 
we give our results on the two models we introduced, and in the last section we 
outline some open problems. 

2 Preliminaries 

2.1 Algebraic Theory of Automata 

An M-variety is a class of finite monoids which is closed under taking
submonoids, surjective homomorphisms, and direct products. Given an M-variety
$\mathbf{V}$, to each finite alphabet $\Sigma$ we associate the class of regular languages
$\mathcal{V}(\Sigma^*) = \{L \subseteq \Sigma^* : M(L) \in \mathbf{V}\}$. It can be shown that $\mathcal{V}(\Sigma^*)$ is a Boolean
algebra closed under quotients (i.e. if $L \in \mathcal{V}(\Sigma^*)$ then for all $w \in \Sigma^*$ we have
$w^{-1}L = \{x : wx \in L\} \in \mathcal{V}(\Sigma^*)$ and $Lw^{-1} = \{x : xw \in L\} \in \mathcal{V}(\Sigma^*)$) and inverse
homomorphisms (i.e. if $\varphi : \Sigma^* \to \Sigma^*$ is a homomorphism and $L \in \mathcal{V}(\Sigma^*)$, then
$\varphi^{-1}(L) \in \mathcal{V}(\Sigma^*)$). Any class of languages satisfying these closure properties is
called a *-variety of languages. A theorem of Eilenberg [10] says that there is a
1-1 correspondence between M-varieties and *-varieties of languages: a driving
theme of the research in automata theory has been to find explicit instantiations
of this abstract correspondence.

The M-variety that plays the key role in our work is the so-called block
groups [20], classically denoted BG. This variety is ubiquitous: it appears in
topological analysis of languages [20], in questions arising in the study of
non-associative algebras [6] and in constraint satisfaction problems [14]. It can be
defined by the following algebraic condition: $M$ is a block group iff for any
$e = e^2$ and $f = f^2$ in $M$, $eM = fM$ or $Me = Mf$ implies $e = f$. For any
language $L$, $M(L)$ is a block group iff $L$ is a Boolean combination of languages
of the form $L_0 a_1 L_1 \ldots a_k L_k$, where each $a_i \in \Sigma$ and each $L_i$ is a language that
can be recognized by a finite group: this class of languages is the largest *-variety
that does not contain $a\Sigma^*$ or $\Sigma^* a$ for arbitrary alphabet satisfying $|\Sigma| \ge 2$ [20].
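
The block-group condition can be tested mechanically on a small monoid. The following is a minimal sketch, assuming the monoid is given as a list of elements and a multiplication function; the names `is_block_group`, `z2_mul`, and `left_zero_mul` are ours and purely illustrative. The second example is the three-element monoid in which words are classified by their first letter (the syntactic monoid of $a\Sigma^*$ over a two-letter alphabet), which fails the condition, in line with the statement above.

```python
from itertools import product


def is_block_group(elements, mul):
    """Check: for idempotents e, f, (eM == fM or Me == Mf) implies e == f."""
    idempotents = [e for e in elements if mul(e, e) == e]

    def right_ideal(e):  # the set eM
        return frozenset(mul(e, m) for m in elements)

    def left_ideal(e):  # the set Me
        return frozenset(mul(m, e) for m in elements)

    for e, f in product(idempotents, repeat=2):
        if e == f:
            continue
        if right_ideal(e) == right_ideal(f) or left_ideal(e) == left_ideal(f):
            return False
    return True


def z2_mul(x, y):
    """The cyclic group of order 2, written additively."""
    return (x + y) % 2


def left_zero_mul(x, y):
    """Monoid {1, a, b} with xy = x for x, y in {a, b}; '1' is the identity."""
    if x == "1":
        return y
    return x


group_ok = is_block_group([0, 1], z2_mul)
first_letter_ok = is_block_group(["1", "a", "b"], left_zero_mul)
```

Here the two idempotents `a` and `b` generate the same left ideal, so the check fails for the second monoid, while a group has a single idempotent and passes trivially.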



2.2 Models 

We adopt the following conventions. Unless otherwise stated, for any machine
$M$ where these symbols are defined, $Q$ is the set of classical states, $\Sigma$ is the
input alphabet, $q_0$ is the initial state, and $Q_{acc} \subseteq Q$ ($Q_{rej} \subseteq Q$) are accepting
(rejecting) states. If $Q_{acc}$ and $Q_{rej}$ are defined then we require $Q_{acc} \cap Q_{rej} = \emptyset$.
Also, each model in this paper uses distinct start and endmarkers, ¢ and $
respectively. On input $w$, $M$ processes the characters of ¢w$ from left to right.

Let $|Q| = n$. For all QFA in this paper, the state of the machine $M$ is a
superposition of the $n$ classical states. Superpositions can be expressed
mathematically as vectors in $\mathbb{C}^n$. For each $q \in Q$ we uniquely associate an element of
the canonical basis of $\mathbb{C}^n$, and we denote this element $|q\rangle$. Now the superposition
can be written as the vector $\sum_i \alpha_i |q_i\rangle$. We say $\alpha_i$ is the amplitude with
which we are in state $q_i$. We now require each such vector to have an $l_2$ norm
of 1, where the $l_2$ norm $\|\sum_i \alpha_i |q_i\rangle\|_2$ of $\sum_i \alpha_i |q_i\rangle$ is $\sqrt{\sum_i |\alpha_i|^2}$. Superpositions are
also sometimes called pure states. There are also cases where the quantum state
of the machine is a random variable; in other words, the state is a 'classical'
probability distribution of superpositions $\{(p_i, \psi_i)\}$, each $\psi_i$ with probability $p_i$.
In this case we say the system is in a mixed state. Mixed states can be expressed
in terms of density matrices [19], and these are usually denoted $\rho$.

A transformation of a superposition is a linear transformation with respect
to a unitary matrix. $A \in \mathbb{C}^{n \times n}$ is called unitary if $A^* = A^{-1}$, where $A^*$ is
the Hermitian conjugate of $A$ and is obtained by taking the conjugate of every
element in $A^T$. Unitary transformations are length preserving, and they are
closed under product. A set $\{A_\sigma\}$ of transformations is defined for each machine,
with one transformation for each $\sigma \in \Sigma \cup \{$¢, \$$\}$.

An outside observer cannot gain a priori information about the state of a
quantum mechanical system except through a measurement operation. A
measurement of a superposition $\psi$ probabilistically projects $\psi$ onto exactly one of
$j$ prespecified disjoint subspaces $E_1 \oplus \cdots \oplus E_j$ spanning $\mathbb{C}^n$. The index of the
selected subspace is communicated to the outside observer. For all $i$, let $P_i$ be
the projection operator for $E_i$. Then the probability of projecting into $E_i$ while
measuring $E_1 \oplus \cdots \oplus E_j$ is $\|P_i \psi\|_2^2$.
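
As a small numerical illustration of the measurement rule, the sketch below computes $\|P_i\psi\|_2^2$ in the special case, assumed here for simplicity, where each subspace $E_i$ is spanned by a subset of the canonical basis (as is the case for $E_{acc}$, $E_{rej}$ and $E_{non}$ in the models that follow). The function name `measurement_probabilities` and the example vector are ours.

```python
import math


def measurement_probabilities(psi, blocks):
    """Return ||P_i psi||^2 for each block of basis indices spanning E_i.

    psi    -- list of complex amplitudes over the canonical basis
    blocks -- a partition of range(len(psi)); block i spans subspace E_i
    """
    norm_sq = sum(abs(a) ** 2 for a in psi)
    if not math.isclose(norm_sq, 1.0, abs_tol=1e-9):
        raise ValueError("a superposition must have l2 norm 1")
    covered = sorted(q for block in blocks for q in block)
    if covered != list(range(len(psi))):
        raise ValueError("the blocks must partition the basis")
    # Projecting onto a span of basis vectors keeps exactly those amplitudes.
    return [sum(abs(psi[q]) ** 2 for q in block) for block in blocks]


# A four-state superposition measured w.r.t. three subspaces.
psi = [0.5, 0.5j, -0.5, 0.5]
probs = measurement_probabilities(psi, [[0], [1, 2], [3]])
```

Because the subspaces are orthogonal and span the whole space, the returned probabilities always sum to 1.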

We will consider two modes of acceptance. For a probabilistic machine $M$,
we say $M$ recognizes $L$ with bounded (two-sided) error if $M$ accepts any $w \in L$
and rejects any $w \notin L$ with probability at least $p$, where $p > 1/2$. We say $M$
recognizes $L$ with bounded positive one-sided error if any $w \in L$ is accepted with
probability $p > 0$ and any $w \notin L$ is rejected with probability 1.

Liquid state NMR is the technique used to implement quantum computing 
on 7 quantum bits [26]. NMR uses nuclei of atoms as quantum bits, and the 
state of the machine is a molecule in which 7 different atoms can be individually
addressed. One of the features of NMR is that quantum transformations are
simultaneously applied to a liquid containing 10^^ molecules. Thus, we have 
the same quantum computation carried out by 10^^ identical quantum comput- 
ers. Applying a measurement is problematic, however. On different molecules, 
the measurement can have a different result. We can determine the fraction of 
molecules that produce each outcome, but we cannot separate the molecules by 
the measurement outcome. Because of that, the operations performed cannot 
be conditional on the outcome of a measurement. On the other hand, measure- 
ments which do not affect the next transformation are allowed. This situation is 
reflected in the definition of our new model, given below: 

Latvian QFA (LQFA). A superset of this model has been studied in [18,5]. A
LQFA is a tuple $M = (Q, \Sigma, \{A_\sigma\}, \{P_\sigma\}, q_0, Q_{acc})$ such that $\{A_\sigma\}$ are unitary
matrices, and $\{P_\sigma\}$ are measurements (each $P_\sigma$ is defined as a set $\{E_1, \ldots, E_j\}$
of orthogonal subspaces). We define $Q_{rej} = Q \setminus Q_{acc}$ and we require that $P_\$$







is a measurement w.r.t. $E_{acc} \oplus E_{rej}$, where $E_{acc} = \mathrm{span}\{Q_{acc}\}$ and $E_{rej} =
\mathrm{span}\{Q_{rej}\}$. Let $\psi$ be the current state. On input $\sigma$, $\psi' = A_\sigma \psi$ is computed
and then measured w.r.t. $P_\sigma$. After processing the $, the state of $M$ will be
in either $E_{acc}$ or $E_{rej}$ and $M$ accepts or rejects accordingly. The acceptance
mode for LQFA is bounded error. This model is introduced as QRA-M-C in the
classification of QFAs introduced in [12].

Also in [12], a probabilistic automata model related to LQFA was introduced,
which they called '1-way probabilistic reversible C-automata' (we abbreviate
this to PRA). A PRA is a tuple $M = (Q, \Sigma, \{A_\sigma\}, q_0, Q_{acc})$, where each $A_\sigma$
is a doubly stochastic matrix. A matrix is doubly stochastic if the sum of the
elements in each row and column is 1. The acceptance mode for PRA is bounded
error. The two models are related in the following way: If $M$ is a LQFA such
that each $P_\sigma$ measures with respect to $\bigoplus_{q \in Q} \mathrm{span}\{|q\rangle\}$ for every $\sigma \in \Sigma$, then
$M$ can be simulated by a PRA. Conversely, a PRA can be simulated by a LQFA
if each $A_\sigma$ of the PRA has a unitary prototype [12]. A matrix $U = [u_{i,j}]$ is a
unitary prototype for $S = [s_{i,j}]$ if for all $i,j$: $|u_{i,j}|^2 = s_{i,j}$. When $S$ has a unitary
prototype it is called unitary stochastic [16]. This relationship between LQFA
and PRA is helpful in proving that certain languages are recognized by LQFA.

Brodsky-Pippenger QFA (BPQFA). The BPQFA model is a variation on
the model introduced by Kondacs and Watrous [15] (we will call this model
KWQFA). A KWQFA is defined by a tuple $M = (Q, \Sigma, \{A_\sigma\}, q_0, Q_{acc}, Q_{rej})$
where each $A_\sigma$ is unitary. The state sets $Q_{acc}$ and $Q_{rej}$ will be halt/accept
and halt/reject states, respectively. We also define $Q_{non} = Q \setminus (Q_{acc} \cup Q_{rej})$ to
be the set of nonhalting states. Lastly, for $\mu \in \{acc, rej, non\}$ we define
$E_\mu = \mathrm{span}\{Q_\mu\}$, and $P_\mu$ to be the projection onto $E_\mu$. Let $\psi$ be the current
state of $M$. On input $\sigma$ the state becomes $\psi' = A_\sigma \psi$ and then $\psi'$ is measured
w.r.t. $E_{acc} \oplus E_{rej} \oplus E_{non}$. If after the measurement the state is in $E_{acc}$ or $E_{rej}$,
$M$ halts and accepts or rejects accordingly. Otherwise, $\psi'$ was projected into
$E_{non}$ and $M$ continues. We require that after reading $ the state is in $E_{non}$ with
probability 0. The acceptance mode for KWQFA is bounded error.

The BPQFA model is one of several variations introduced by Brodsky and 
Pippenger in [8], which they called ‘end-decisive with positive one-sided error’. 
A BPQFA M is a KWQFA where M is not permitted to halt in an accepting 
state until $ is read, and the acceptance mode is changed to bounded positive 
one-sided error. Any BPQFA can be simulated by a KWQFA [8]. 

3 Latvian QFA 

Our main result for this model is a complete characterization of the languages 
recognized by LQFA: 

Theorem 1. LQFA recognize exactly those languages whose syntactic monoid 
is in BG. 

Proof: We begin by showing that the languages recognized by LQFA form a
*-variety of languages. It is straightforward to show:

Theorem 2. The class of languages recognized by LQFA is closed under union,
complement, inverse homomorphisms, and word quotient.

Next, to prove that LQFA cannot recognize any language whose syntactic monoid
is not in BG, we need to show that LQFA cannot recognize $\Sigma^* a$ or $a\Sigma^*$. We
note that LQFA are a special case of Nayak's EQFA model [18], and EQFAs
cannot recognize $\Sigma^* a$. We sketch the proof that $a\Sigma^*$ is not recognizable below.

Theorem 3. LQFAs cannot recognize $a\Sigma^*$.

Finally, we prove the following theorem below:

Theorem 4. LQFAs recognize any language whose syntactic monoid is in BG.
This will complete the characterization. □

Proof of Theorem 3 (sketch) Suppose the LQFA $M$ recognized $a\Sigma^*$. Let $\rho_w$
be the state of $M$ on reading $w$ as a density matrix. Suppose $\sigma$ and $\tau$ are of the
form $\sigma = \sum_{w \in S \subseteq a\Sigma^*} p_w \rho_w$, $\tau = \sum_{w \in T \subseteq b\Sigma^*} p_w \rho_w$ with $\sum p_w = 1$. By linearity
we can distinguish between $\sigma$ and $\tau$ using $M$ with some fixed probability $p > 1/2$.
We show that a sequence $\sigma_1, \sigma_2, \ldots$ of $\sigma$ matrices and a sequence $\tau_1, \tau_2, \ldots$ of $\tau$
matrices converge to the same limit, causing a contradiction.

We will need some notions from quantum information theory [19]. A completely
positive superoperator (CPSO) is a linear operation that is a completely positive
map on the space of $d \times d$ (particularly, density) matrices. For any density matrix
$\rho$, the Von Neumann entropy $S(\rho)$ of $\rho$ is $-\sum_i \lambda_i \log \lambda_i$, where the $\lambda_i$ are the
eigenvalues of $\rho$. It can be shown that any sequence of unitary transformations
and measurements forms a CPSO $E$ satisfying $S(E\rho) \ge S(\rho)$ for any $\rho$.

For all CPSOs E, we define E' to be the (CPSO) operation that performs 
the operation E with probability 1/2, and the identity otherwise. 

Lemma 1. For any CPSO $E$ such that $S(E\rho) \ge S(\rho)$ and any mixed state $\rho$,
the sequence $E'\rho, (E')^2\rho, \ldots$ converges. Let $E_{lim}$ be the map $\rho \to \lim_{i \to \infty} (E')^i \rho$.
Then, $E_{lim}$ is a CPSO and $S(E_{lim}\rho) \ge S(\rho)$ for any density matrix $\rho$.

Lemma 2. Let $A$, $B$ be two sequences of unitary transformations and
measurements. Let $C = A_{lim}B_{lim}$ and $D = B_{lim}A_{lim}$. Then, $C_{lim} = D_{lim}$.

Let $A$, $B$ be the operations corresponding to reading $a$, $b$. We also consider
$A_{lim}$, $B_{lim}$, $C = A_{lim}B_{lim}$, $D = B_{lim}A_{lim}$, $C_{lim}$ and $D_{lim}$. Let $Q_a$ ($Q_b$) be the
set of density matrices corresponding to all probabilistic combinations of states
$\rho_{ax}$ ($\rho_{bx}$). Let $\bar{Q}_a$ and $\bar{Q}_b$ be the closures of $Q_a$ and $Q_b$.

Lemma 3. Let $\rho$ be the state after reading the start marker ¢. Then, $C_{lim}\rho \in \bar{Q}_a$
and $D_{lim}\rho \in \bar{Q}_b$.

By Lemmas 2 and 3, there exist sequences corresponding to $C_{lim}\rho$ and
$D_{lim}\rho$ that are respectively probabilistic combinations of $\rho_{ax}$ and $\rho_{bx}$, and they
converge to the same limit. □

The next theorem will assist in our proof of Theorem 4.

Theorem 5. LQFA can recognize languages of the form $\Sigma^* a_1 \Sigma^* \ldots a_k \Sigma^*$.

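Proof: We start with the construction of a PRA that recognizes $\Sigma^* a_1 \Sigma^* \ldots a_k \Sigma^*$
with probability $\left(\frac{n-1}{n}\right)^k$, where $n$ is any natural number. We construct our PRA
inductively on the length of the subword. For $k = 1$ we construct $M^{(1)} =
(Q^{(1)}, q_0, \Sigma, \{A_\sigma^{(1)}\}, Q_{acc}^{(1)})$ as follows. Let $Q^{(1)} = \{q_0, q_2, \ldots, q_n\}$,
$A_{a_1}^{(1)} = \frac{1}{n}\mathbf{1}$ (where $\mathbf{1}$ is an $n \times n$ matrix of all ones), $A_\sigma^{(1)} = I$ for all $\sigma \ne a_1$, and
$Q_{acc}^{(1)} = Q^{(1)} \setminus \{q_0\}$. It is easy to check that this machine accepts any $w \in \Sigma^* a_1 \Sigma^*$
with probability $\frac{n-1}{n}$ and rejects any $w \notin \Sigma^* a_1 \Sigma^*$ with probability 1.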

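Assume we have a machine $M^{(i-1)} = (Q^{(i-1)}, q_0, \Sigma, \{A_\sigma^{(i-1)}\}, Q_{acc}^{(i-1)})$ recognizing
inputs containing the subword $a_1 \ldots a_{i-1}$ with probability $\left(\frac{n-1}{n}\right)^{i-1}$; we
construct $M^{(i)} = (Q^{(i)}, q_0, \Sigma, \{A_\sigma^{(i)}\}, Q_{acc}^{(i)})$ recognizing inputs containing the
subword $a_1 \ldots a_i$ with probability $\left(\frac{n-1}{n}\right)^i$. Our augmentation will proceed as
follows. First let $Q_{acc}^{(i)}$ be a set of $(n-1)^i$ new states all distinct from $Q^{(i-1)}$ and let
$Q^{(i)} = Q^{(i-1)} \cup Q_{acc}^{(i)}$. For each of the states $q \in Q_{acc}^{(i-1)}$ we uniquely associate $n-1$
states $q_2, \ldots, q_n \in Q_{acc}^{(i)}$. We leave $q_0$ unchanged.

Finally, we construct each $A_\sigma^{(i)}$ from $A_\sigma^{(i-1)}$. Define $\tilde{A}_\sigma^{(i-1)}$ to be the
transformation that acts as $A_\sigma^{(i-1)}$ on $Q^{(i-1)}$ and as the identity elsewhere. We let
$A_\sigma^{(i)} = \tilde{A}_\sigma^{(i-1)} B_\sigma^{(i)}$, where $B_\sigma^{(i)}$ is an additional transformation that will process
the $a_i$ character (note that the matrices are applied from right to left). For all
$\sigma \ne a_i$ we define $B_\sigma^{(i)} = I$. For $\sigma = a_i$ we define $B_\sigma^{(i)}$ so that, independently
for each $q \in Q_{acc}^{(i-1)}$, the transformation $\frac{1}{n}\mathbf{1}$ is applied to $\{q, q_2, q_3, \ldots, q_n\}$. At
the end we have a machine $M = M^{(k)}$ that recognizes $\Sigma^* a_1 \Sigma^* \ldots a_k \Sigma^*$.

To simplify notation, we define $Q^{(0)} = Q_{acc}^{(0)} = \{q_0\}$ and $B_\sigma^{(1)} = A_\sigma^{(1)}$ for all
$\sigma$. The correctness of the construction follows from this lemma: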

Lemma 4. Let $w$ be any word. As we process $w$ with $M$, for all $0 \le i < k$ the
total probability of $M$ being in one of the states of $Q^{(i)}$ is nonincreasing.

Proof of Lemma 4: Every nontrivial $A_\sigma$ matrix can be decomposed into a
product of $B_\sigma^{(i)}$ matrices operating on different parts of the state space. All of
these matrices operate on the machine state in such a way that for any $\{q, q'\} \subseteq
Q_{acc}^{(i)}$, at any time there is an equal probability of being in state $q$ or $q'$. Thus
it is sufficient to keep track of the total probability of being in $Q_{acc}^{(i)}$. For any
$S \subseteq Q$, denote by $P(S)$ the sum probability of being in one of the states of $S$.

For all $0 \le i < k$ the machine can only move from $Q^{(i)}$ to $Q \setminus Q^{(i)}$ when
$B_{a_{i+1}}^{(i+1)}$ is applied, and this matrix has the effect of averaging over $Q_{acc}^{(i)} \cup Q_{acc}^{(i+1)}$.
Since $|Q_{acc}^{(i+1)}| = (n-1)|Q_{acc}^{(i)}|$, it follows that a $B_{a_{i+1}}^{(i+1)}$ operation will not increase $P(Q^{(i)})$
unless $P(Q_{acc}^{(i+1)}) > (n-1)P(Q_{acc}^{(i)})$. It can easily be shown by induction on the
sequence of $B_\sigma^{(i)}$ matrices forming the transitions of $M$ that this condition is
never satisfied. Thus $P(Q^{(i)})$ is nonincreasing for all $i$. □

First we show that any $w \notin L$ is rejected with certainty. The transitions
are constructed in such a way that $M$ can only move from $Q_{acc}^{(i-1)}$ to $Q_{acc}^{(i)}$ upon
reading $a_i$, and $M$ cannot move from $Q_{acc}^{(i-1)}$ to $Q_{acc}^{(i+1)}$ in one step (even if $a_i =
a_{i+1}$). Next we show that any $w \in L$ is accepted with probability $\left(\frac{n-1}{n}\right)^k$. After
reading the first $a_1$, $P(Q_{acc}^{(1)}) \ge \frac{n-1}{n}$ and by Lemma 4 this remains satisfied until
$a_2$ is read, at which point $M$ will satisfy $P(Q_{acc}^{(2)}) \ge \left(\frac{n-1}{n}\right)^2$. Inductively after
reading subword $a_i$, $M$ satisfies $P(Q_{acc}^{(i)}) \ge \left(\frac{n-1}{n}\right)^i$. Thus $M$ indeed recognizes
$\Sigma^* a_1 \Sigma^* \ldots a_k \Sigma^*$.
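
The construction can be checked on small instances with exact arithmetic. The sketch below is only an illustration under stated assumptions: the averaging block is taken to be $\frac{1}{n}\mathbf{1}$ acting on each state of level $i-1$ together with its $n-1$ associated states of level $i$, states are encoded as tuples of labels (an encoding of ours, with `()` playing the role of $q_0$), and on each input letter the $B^{(i)}$ blocks are applied starting with the highest level, following the right-to-left convention. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def build_states(n, k):
    """All states: () is q_0; a level-i state is a tuple of i labels in 2..n."""
    labels = range(2, n + 1)
    return [path for i in range(k + 1) for path in product(labels, repeat=i)]


def apply_averaging(dist, level, n):
    """Average every level-(level-1) state with its n-1 children at `level`."""
    new = dict(dist)
    for q in dist:
        if len(q) == level - 1:
            block = [q] + [q + (j,) for j in range(2, n + 1)]
            avg = sum(dist[s] for s in block) / n
            for s in block:
                new[s] = avg
    return new


def acceptance_probability(word, subword, n):
    """Probability mass on the top level after reading `word`."""
    k = len(subword)
    dist = {q: Fraction(0) for q in build_states(n, k)}
    dist[()] = Fraction(1)
    for sigma in word:
        # Highest level first, so one letter advances at most one level.
        for level in range(k, 0, -1):
            if sigma == subword[level - 1]:
                dist = apply_averaging(dist, level, n)
    return sum(p for q, p in dist.items() if len(q) == k)


p_ab = acceptance_probability("ab", "ab", 3)       # contains the subword
p_ba = acceptance_probability("ba", "ab", 3)       # does not contain it
p_single_a = acceptance_probability("a", "aa", 3)  # one 'a' is not enough
```

With $n = 3$ and subword $ab$, the word $ab$ is accepted with probability $(2/3)^2 = 4/9$, while $ba$ is rejected with certainty; raising $n$ pushes the acceptance probability towards 1 at the cost of more states.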

All that remains is to show that we can simulate each $A_\sigma$ using LQFA
transformations. Recall that each $A_\sigma$ is a product of $B_\sigma^{(i)}$ matrices operating on
different parts of the state space. If each $B_\sigma^{(i)}$ has a unitary prototype, then each
$A_\sigma$ could be simulated using the series of $l$ transformations and measurements.
We first show that we can collapse this operation into one transformation and
one measurement. Assume we have a sequence of $l$ unitaries $U_i$ on a space $E$,
each of them followed by a measurement $E_{i1} \oplus \cdots \oplus E_{ik_i}$. Define a new space
$E'$ of dimension $(\dim E) \cdot \prod_i k_i$ spanned by states $|\psi\rangle|j_1\rangle \cdots |j_l\rangle$, $|\psi\rangle \in E$,
$j_i \in \{0, \ldots, (k_i - 1)\}$. Each $U_i$ can be viewed as a transformation on $E'$ that
acts only on the $|\psi\rangle$ part of the state. Replace the measurements by unitary
transformations $V_i$ defined by:

$V_i |\psi\rangle|j_1\rangle \cdots |j_i\rangle \cdots |j_l\rangle = |\psi\rangle|j_1\rangle \cdots |(j_i + j) \bmod k_i\rangle \cdots |j_l\rangle$

for $|\psi\rangle \in E_{ij}$. Consider a sequence of $l$ unitaries and $l$ measurements on $E$.
Starting from $|\psi\rangle$, it produces a mixed state $\{(p_i, |\psi_i\rangle)\}$, where each $(p_i, |\psi_i\rangle)$
corresponds to a specific sequence of measurement outcomes. Then, if we start
with $|\psi\rangle|j_1\rangle \cdots |j_l\rangle$ and perform $U_1, V_1, \ldots, U_l, V_l$ and then measure all of $j_1, \ldots,
j_l$, the final state is $|\psi_i\rangle|j'_1\rangle \ldots |j'_l\rangle$ for some $j'_1, \ldots, j'_l$ with probability $p_i$. Thus,
when we restrict to the $|\psi\rangle$ part of the state, the two sequences of transformations
are effectively equivalent. Finally, composing the $U_i$ and $V_i$ transformations gives
one unitary $U$ and we get one unitary followed by one measurement. It is now
sufficient to prove that each $B_\sigma^{(i)}$ has a unitary prototype.

Observe that any block diagonal matrix such that all of the blocks have
unitary prototypes itself has a unitary prototype, and that unitary prototypes are
trivially closed under permutations. Each $B_\sigma^{(i)}$ can be written as a block diagonal
matrix, where each block is the $1 \times 1$ identity matrix or the $\frac{1}{n}\mathbf{1}$ matrix, so it
remains to show that there is a unitary prototype for $\frac{1}{n}\mathbf{1}$ matrices. Coincidentally
the quantum Fourier transform matrix [19], which is the basis for most efficient
quantum algorithms, is a unitary prototype for $\frac{1}{n}\mathbf{1}$. This completes the proof
that $A_\sigma$ can be simulated by an LQFA, and the proof of the theorem. □
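
The last claim is easy to confirm numerically. The following sketch, using only the standard library, builds the $n \times n$ quantum Fourier transform matrix with entries $\omega^{jk}/\sqrt{n}$, $\omega = e^{2\pi i/n}$, and checks that it is unitary and that $|u_{j,k}|^2 = 1/n$ for every entry; the helper names are ours.

```python
import cmath


def qft_matrix(n):
    """n x n quantum Fourier transform: entry (j, k) is omega^(jk) / sqrt(n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    scale = 1 / n ** 0.5
    return [[scale * omega ** (j * k) for k in range(n)] for j in range(n)]


def is_unitary(u, tol=1e-9):
    """Check that the rows of the square matrix u are orthonormal."""
    n = len(u)
    for r in range(n):
        for c in range(n):
            inner = sum(u[r][t] * u[c][t].conjugate() for t in range(n))
            if abs(inner - (1 if r == c else 0)) > tol:
                return False
    return True


def is_unitary_prototype(u, s, tol=1e-9):
    """Check |u[i][j]|^2 == s[i][j] for all entries."""
    n = len(u)
    return all(
        abs(abs(u[i][j]) ** 2 - s[i][j]) <= tol
        for i in range(n)
        for j in range(n)
    )


n = 5
uniform = [[1 / n] * n for _ in range(n)]   # the matrix (1/n) * ones
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
F = qft_matrix(n)
qft_is_prototype = is_unitary(F) and is_unitary_prototype(F, uniform)
```

Every entry of the Fourier matrix has modulus $1/\sqrt{n}$, which is why it works for any $n$, whereas for instance the identity matrix is unitary but is not a prototype for the uniform matrix.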

Proof of Theorem 4: We give a PRA construction recognizing the language
$L$ defined by $w \in L$ if and only if $w = w_0 a_1 w_1 \ldots a_k w_k$ where for each $i$,
$w_0 a_1 w_1 \ldots w_i \in L_i$ for some prespecified group languages $L_0, \ldots, L_k$. By the
cancellative law of groups, this is sufficient to show that PRA recognize any
language of the form $L_0 a_1 L_1 \ldots a_k L_k$. We will see that each transition matrix has
a unitary prototype, thus there is an LQFA recognizing this language as well.
This along with the closure properties of LQFA is sufficient to prove that any
language whose syntactic monoid is in BG is recognized by an LQFA.

For all $i$ let $G_i = M(L_i)$. Also let $\varphi_i : \Sigma^* \to G_i$ and $F_i$ be such that
$\varphi_i^{-1}(F_i) = L_i$. We compose these groups into a single group $G = G_0 \times \cdots \times G_k$
with identity $1 = (1, 1, \ldots, 1)$.

Let $M = (Q, q_0, \Sigma, \{A_\sigma\}, Q_{acc})$ be a PRA recognizing the subword $a_1 \ldots a_k$
constructed as in Theorem 5. From $M$ we construct $M' = (Q', q'_0, \Sigma, \{A'_\sigma\}, Q'_{acc})$
recognizing $L$. We set $Q' = Q \times G$, $q'_0 = (q_0, 1)$, $Q'_{acc} = Q_{acc} \times F_k$, and $A'_¢ =
A'_\$ = I$. For each $\sigma \in \Sigma$ define $A'_\sigma$ as follows. Let $P_\sigma$ be the permutation matrix
that maps $(q, g)$ to $(q, g\sigma)$ for each $q \in Q$ and $g \in G$. For each $1 \le i \le k$ let
$A'_{\sigma i}$ be the matrix that, for each $f \in F_{i-1}$, acts as the transformation $B_\sigma^{(i)}$ on
$Q \times \{f\}$ and as the identity everywhere else. Finally, $A'_\sigma = P_\sigma A'_{\sigma 1} \cdots A'_{\sigma k}$.

The $A'_\sigma$ are constructed so that $M'$ keeps track of the current group element
at every step. If $M'$ is in state $(q, g)$, then after applying $A'_{\sigma 1}, \ldots, A'_{\sigma k}$ it remains
in $Q \times \{g\}$ with probability 1. The matrix $P_\sigma$ 'translates' all of the transition
probabilities from $Q \times \{g\}$ to $Q \times \{g\sigma\}$. Initially $M'$ is in $Q \times \{1\}$, so after reading
any partial input $w$, $M'$ will be in $Q \times \{1w\}$ with probability 1. In this way $M'$
will always keep track of the current group element.

Each $A'_\sigma$ matrix refines $A_\sigma$ from the $\Sigma^* a_1 \Sigma^* a_2 \cdots a_k \Sigma^*$ construction in such
a way that, on input $\sigma$ after reading $w$, we do not move from $Q^{(i-1)}$ to $Q^{(i)}$ (the
action performed by $B_\sigma^{(i)}$) unless $\sigma = a_i$ and $w \in F_{i-1}$. This is exactly what we
need to recognize $L$. The transition matrices can be simulated by LQFA by the
same argument as in Theorem 5.

Lemma 5. Let $w$ be any word. As we process the characters of $w$ in $M'$, for
all $0 \le i < k$ the total probability of being in one of the states of $Q^{(i)} \times G$ is
nonincreasing.

Proof: The same argument as in Lemma 4 holds.



Proof of correctness: It is easy to see that $M'$ will reject any word not in
$L$. We do not move out of $Q^{(0)} \times G$ unless we read $a_1$ in the correct context.
Inductively, we do not move into $Q'_{acc}$ unless we have read each subword letter in
the correct context and the current state corresponds to a group element $f \in F_k$.
Now suppose $w \in L$. Rewrite $w$ as $w_0 a_1 \cdots a_k w_k$. Clearly $M'$ does not move out
of $Q^{(0)} \times G$ while reading $w_0$. The character $a_1$ is now read, and $M'$ moves
to $(Q^{(1)} \times G) \setminus (Q^{(0)} \times G)$ with probability $\frac{n-1}{n}$. By the previous lemma, this
probability does not decrease while reading $w_1$. So now after reading $w_0 a_1 w_1$
we will be in $Q_{acc}^{(1)} \times G$ with probability at least $\frac{n-1}{n}$. If $a_2$ is read we move to
$(Q^{(2)} \times G) \setminus (Q^{(1)} \times G)$ with probability $\left(\frac{n-1}{n}\right)^2$. By induction after reading
$w_0 a_1 \ldots w_{k-1} a_k$ we move to $(Q^{(k)} \times G) \setminus (Q^{(k-1)} \times G)$ with total probability at
least $\left(\frac{n-1}{n}\right)^k$. Finally, after reading $w_k$ we move to $Q'_{acc}$ with total probability
at least $\left(\frac{n-1}{n}\right)^k$, and so we
accept any $w \in L$ with this probability. By choosing a suitable $n$ we can recognize
$L$ with arbitrarily high probability. □

4 Results for BPQFA

Our main result for BPQFA is given below: 

Theorem 6. The language L has its syntactic monoid in BG iff it is a Boolean 
combination of languages recognized by BPQFA. 

Proof: Similar to the LQFA case, we first show that this class of languages forms
a *-variety. BPQFAs have been shown to be closed under inverse homomorphisms
and word quotient [8], and we get Boolean combinations by definition. Next, we
give the lower bounds. It is known that BPQFA cannot recognize $\Sigma^* a$, since
KWQFA cannot recognize $\Sigma^* a$ [15] and any BPQFA can be simulated by a
KWQFA. This proof can be easily extended to Boolean combinations of BPQFA.
We prove the following theorem later in the section:

Theorem 7. The language $a\Sigma^*$ is not a Boolean combination of languages
recognized by BPQFA.

Thus $L$ is a Boolean combination of languages recognized by BPQFA only if
$M(L)$ is in BG. Finally, we prove the following upper bound, by extending a
construction of [8] in a manner similar to Theorem 4:

Theorem 8. Any language whose syntactic monoid is in BG is a Boolean com- 
bination of languages recognized by BPQFA. 

This completes the proof of the main result. □ 

Proof of Theorem 7: We use a technique introduced in [15] to analyze
BPQFAs. Let $\psi$ be an unnormalized state vector of $M$. Define $A'_\sigma = P_{non}A_\sigma$, and for
any word $w = w_1 \ldots w_k$ let $A'_w = A'_{w_k} \cdots A'_{w_1}$. Then if $\psi$ is the start vector, the
vector $\psi_w = A'_w \psi$ completely describes the probabilistic behaviour of $M$, since
$M$ halts while reading $w$ with probability $1 - \|\psi_w\|_2^2$, and continues in state
$\psi_w$ with probability $\|\psi_w\|_2^2$. We also use the following lemma from [4]:

Lemma 6. [4] Let $\{x, y\} \subseteq \Sigma^+$. Then there are subspaces $E_1$, $E_2$ s.t. $E_{non} =
E_1 \oplus E_2$ and (1) if $\psi \in E_1$, then $A'_x(\psi) \in E_1$ and $A'_y(\psi) \in E_1$ and $\|A'_x(\psi)\| = \|\psi\|$
and $\|A'_y(\psi)\| = \|\psi\|$; (2) if $\psi \in E_2$, then for any $\epsilon > 0$, and for any word
$t \in (x|y)^*$ there exists a word $t' \in (x|y)^*$ such that $\|A'_{tt'}(\psi)\| < \epsilon$.

We first show that, for any BPQFA $M$, any $\epsilon > 0$, and for any two prefixes
$v, w \in \{a, b\}^+$, there exist $v', w' \in \{a, b\}^*$ such that $\|A'_{vv'}\psi - A'_{ww'}\psi\|_2^2 < \epsilon$. In
other words, any input with prefix $vv'$ is indistinguishable from an input with
prefix $ww'$ by $M$. Let $\psi = A'_¢(|q_0\rangle)$, and let $b$ be some letter in $\Sigma \setminus \{a\}$. As in
Lemma 6, separate $E_{non}$ into two subspaces $E_1$ and $E_2$ with respect to the words
$x = a$ and $y = b$. Then we can rewrite $\psi$ as $\psi = \psi_1 + \psi_2$, where $\psi_i \in E_i$. By the
lemma, and since $A'_a$ and $A'_b$ act unitarily on $E_1$, for any $\epsilon'$ there exist $v'$ and
$w'$ such that $\|A'_{vv'}\psi - \psi_1\|_2^2 < \epsilon'$ and $\|A'_{ww'}\psi - \psi_1\|_2^2 < \epsilon'$. For sufficiently small
$\epsilon'$ we have $\|A'_{vv'}\psi - A'_{ww'}\psi\|_2^2 < \epsilon$.

Suppose we have a language $L$ that is a Boolean combination of $m$ languages
$L_1, \ldots, L_m$ recognized by BPQFA. As above, we can construct inductively on the
$L_i$ languages two words $v = v_1 v_2 \cdots v_m \in \{a, b\}^*$ and $w = w_1 w_2 \cdots w_m \in \{a, b\}^*$
such that $av$ and $bw$ are indistinguishable for every $L_i$. Thus we must have
either $\{av, bw\} \subseteq L$ or $L \cap \{av, bw\} = \emptyset$. Either way, the Boolean combination
of BPQFAs does not recognize $a\Sigma^*$. □

Note that in our characterization we have to take ‘Boolean combinations’ 
because BPQFA are not closed under complement. This follows from the theorem 
below, which we will prove in the full version: 

Theorem 9. Over any $\Sigma$ s.t. $\{a, b\} \subseteq \Sigma$, BPQFA cannot recognize $\Sigma^* b \Sigma^* a \Sigma^*$.

5 Conclusion 

In this paper we have produced algebraic characterizations for the languages 
that can be recognized by Brodsky-Pippenger Quantum Finite Automata and 
by a new model which we called Latvian Quantum Finite Automata. A some- 
what surprising consequence of our results is that the two models are equivalent 
in power, up to Boolean combinations. It has been shown that a language L is 
recognizable by an LQFA iff its syntactic monoid is a block group; hence mem- 
bership in the class is decidable. The situation is more complicated for BPQFA 
since the corresponding class of languages is not closed under complement. The 
good news is that we have shown that the class forms what is known as a positive 
*-variety and thus is amenable to algebraic description through the mechanism 
of ordered monoids [22]. We know that this positive *-variety strictly contains 
the regular languages that are open in the group topology and a precise charac- 
terization seems to be within reach. 

Another open problem is to characterize algebraically the Kondacs-Watrous 
model. It is an easy consequence of our results on BPQFA that KWQFA can 
recognize any language whose syntactic monoid is in BG. However, outside of 
BG the question of language recognition is still unresolved. 

The class of languages recognized by KWQFA is known not to be closed under
union [4], hence does not form a *-variety. It is nevertheless meaningful to ask
for an algebraic description of the *-variety generated by those languages. We
conjecture that the right answer involves replacing block groups by a 1-sided
version V of this M-variety defined by the following condition: for any $e = e^2$ and
$f = f^2$ in $M$, $eM = fM$ implies $e = f$. The corresponding variety of languages
can be described as the largest variety that does not contain $\Sigma^* a$ for $|\Sigma| \ge 2$.
Abstract. The oracle identification problem (OIP) is, given a set $S$ of
$M$ Boolean oracles out of $2^N$ ones, to determine which oracle in $S$ is the
current black-box oracle. We can exploit the information that the candidates
for the current oracle are restricted to $S$. The OIP contains several concrete
problems such as the original Grover search and the Bernstein-Vazirani
problem. Our interest is in the quantum query complexity, for which
we present several upper bounds. They are quite general and mostly
optimal: (i) The query complexity of OIP is $O(\sqrt{N \log M \log N} \log\log M)$
for any $S$ such that $M = |S| > N$, which is better than the obvious
bound $N$ if $M < 2^{N/\log^3 N}$. (ii) It is $O(\sqrt{N})$ for any $S$ if $|S| = N$,
which includes the upper bound for the Grover search as a special case.
(iii) For a wide range of oracles ($|S| = N$) such as random oracles and
balanced oracles, the query complexity is $\Theta(\sqrt{N/K})$, where $K$ is a simple
parameter determined by $S$.



1 Introduction 

An oracle is given as a Boolean function of $n$ variables, denoted by
$f(x_0, \ldots, x_{n-1})$, and so there are $2^{2^n}$ (or $2^N$ for $N = 2^n$) different oracles.
An oracle computation is, given a specific oracle $f$ which we do not know, to
determine, through queries to the oracle, whether or not $f$ satisfies a certain
property. Note that $f$ has $N$ black-box 0/1-values, $f(0, \ldots, 0)$ through $f(1, \ldots, 1)$.
($f(0, \ldots, 0)$ is also denoted as $f(0)$, $f(1, \ldots, 1)$ as $f(N-1)$, and similarly for an
intermediate $f(j)$.) So, in other words, we are asked whether or not these $N$ bits
satisfy the property. There are many interesting such properties: For example,
it is called OR if the question is whether all the $N$ bits are 0 and Parity if the
question is whether the $N$ bits include an even number of 1's. The most general
question (or job in this case) is to obtain all the $N$ bits. Our complexity measure
is the so-called query complexity, i.e., the number of oracle calls needed to get a right
answer with bounded error. Note that the trivial upper bound is $N$ since we can
tell all the $N$ bits by asking $f(0)$ through $f(N-1)$. If we use a classical
computer, this $N$ is also a lower bound in most cases. If we use a quantum computer,
however, several interesting speedups are obtained. For example, the previous
three problems have a (quantum) query complexity of $O(\sqrt{N})$, $N/2$ and $N/2 + \sqrt{N}$,
respectively [22,12,20,18].

In this paper, we discuss the following problem which we call the oracle
identification problem: We are given a set $S$ of $M$ different oracles out of the $2^N$
ones for which we have the complete information (i.e., for each of the $2^N$ oracles,
we know whether it is in $S$ or not). Now we are asked to determine which oracle
in $S$ is currently in the black-box. A typical example is the Grover search [22]
where $S = \{f_0, \ldots, f_{N-1}\}$ and $f_i(j) = 1$ iff $i = j$. (Namely, exactly one bit
among the $N$ bits is 1 in each oracle in $S$. Finding its position is equivalent
to identifying the oracle itself.) It is well-known that its query complexity is
$O(\sqrt{N})$. Another example is the so-called Bernstein-Vazirani problem [14] where
$S = \{f_0, \ldots, f_{N-1}\}$ and $f_i(j) = 1$ iff the inner product of $i$ and $j$ (mod 2) is 1.
A little surprisingly, its query complexity is just one.

Thus the oracle identification problem is a promise version of the oracle 
computation problem. For both oracle computation and oracle identification 
problems, Ambainis developed a very general method for proving their lower 
bounds of the query complexity [4]. Also, many nontrivial upper bounds are 
known, as mentioned above. However, all those upper bounds are for specific
problems such as the Grover search; no general upper bounds for a wide class of 
problems have been known so far. 

Our Contribution. In this paper, we give general upper bounds for the
oracle identification problem. More concretely we prove: (i) The query
complexity of the oracle identification for any set $S$ is $O(\sqrt{N \log M \log N} \log\log M)$ if
$|S| = M > N$. (ii) It is $O(\sqrt{N})$ for any $S$ if $|S| = N$. (iii) For a wide range
of oracles ($M = N$) such as random oracles and balanced oracles, the query
complexity is $\Theta(\sqrt{N/K})$, where $K$ is a parameter determined by $S$. The bound in
(i) is better than the obvious bound $N$ if $M < 2^{N/\log^3 N}$. Both algorithms for
(i) and (ii) are quite tricky, and the result (ii) includes the upper bound for the
Grover search as a special case. Result (i) is almost optimal, and results (ii) and
(iii) are optimal; to prove their optimality we introduce a general lower bound
theorem whose statement is simpler than that of Ambainis [4].

Related Results. Query complexity has constantly been one of the central 
topics in quantum computation; to cover everything is obviously impossible. For 
the upper bounds of the query complexity, the most significant result is due 
to Grover [22], known as the Grover search, which also derived many applica- 
tions and extensions [10,12,16,23,24]. In particular, some results showed efficient 
quantum algorithms by combining the Grover search with other (quantum and 
classical) techniques. For example, the quantum counting algorithm [15] gives an
approximate counting method by combining the Grover search with the quantum
Fourier transformation, and quantum algorithms for the claw-finding and the
element distinctness problems [13] also exploit classical random sampling and
sorting. Most recently, Ambainis developed an optimal quantum algorithm with
$O(N^{2/3})$ queries for the element distinctness problem [6], which makes use of quantum
walks and matches the lower bound shown by Shi [26]. Aaronson and
Ambainis also showed an efficient quantum search algorithm for spatial regions
[3] based on recursive Grover search, which is applicable to some geometrically
structured problems such as search on a 2-D grid.

On the lower-bound side, there are two popular techniques to derive quantum 
lower bounds, i.e., the polynomial method and the quantum adversary method. 
The polynomial method was first introduced for quantum computation by
[9], who borrowed the idea from the classical counterpart. For example, it was
shown that for bounded error cases, evaluations of AND and OR functions need
$\Theta(\sqrt{N})$ queries, while parity and majority functions need at least $N/2$ and
$\Theta(N)$, respectively. Recently, [1,26] used the polynomial method to show the
lower bounds for the collision and element distinctness problems.

The classical adversary method was used in [11,27], which is also called the 
hybrid argument. Their method can be used, for example, to show the lower 
bound of the Grover search. As mentioned above, Ambainis introduced a quite 
general method, which is known as the quantum adversary argument, for obtain- 
ing lower bounds of various problems, e.g., the Grover search, AND of ORs and 
inverting a permutation [4]. [7] recently established a lower bound of $\Omega(\sqrt{N})$ on
the bounded-error quantum query complexity of read-once Boolean functions by
extending [4], and [8] generalized the quantum adversary method from the as- 
pect of semidefinite programming. Furthermore, [17,2] showed the lower bounds 
for graph connectivity and local search problem respectively using the quan- 
tum adversary method. Ambainis also gave a comparison between the quantum 
adversary method and the polynomial method [5]. 

2 Formalization 

Our model is basically the same as standard ones (see e.g., [4]). An oracle is
given as a Boolean function $f(x_0, \ldots, x_{n-1})$ of $n$ variables, which transfers a
quantum state from $|x_0, \ldots, x_{n-1}\rangle|b\rangle$ to $(-1)^{b \cdot f(x_0, \ldots, x_{n-1})}|x_0, \ldots, x_{n-1}\rangle|b\rangle$. A
quantum computation is a sequence of unitary transformations $U_0 \to O \to U_1 \to
O \to \cdots \to O \to U_t$, where $O$ is a single oracle call against our black-box oracle
(sometimes called an input oracle), and $U_j$ may be any unitary transformation
without oracle calls. The above computation sequence involves $t$ oracle calls,
which is our measure of the complexity (the query complexity). Let $N = 2^n$, and
hence there are $2^N$ different oracles.
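As an illustration of this model, the following is a minimal classical simulation sketch of a computation with a single oracle call, $U_0 \to O \to U_1$, taking both $U_0$ and $U_1$ to be Hadamard transforms on all qubits. With the answer bit fixed to $b = 1$, the oracle acts as the phase flip $(-1)^{f(x)}$. For the Bernstein-Vazirani oracles this single call suffices to recover the index. The function names (`hadamard_all`, `phase_oracle`, `bernstein_vazirani`) are illustrative choices, not taken from the paper.

```python
import math

def hadamard_all(state):
    """Apply H to every qubit of a state vector of length 2**n
    (fast Walsh-Hadamard transform followed by normalization)."""
    state = list(state)
    size = len(state)
    h = 1
    while h < size:
        for start in range(0, size, 2 * h):
            for i in range(start, start + h):
                x, y = state[i], state[i + h]
                state[i], state[i + h] = x + y, x - y
        h *= 2
    norm = math.sqrt(size)
    return [v / norm for v in state]

def phase_oracle(state, f):
    """One oracle call with b = 1: multiply the amplitude of |x> by (-1)^f(x)."""
    return [(-1) ** f(x) * amp for x, amp in enumerate(state)]

def bernstein_vazirani(f, n):
    """Return the most likely measurement outcome after U_0 -> O -> U_1."""
    size = 2 ** n
    state = [1.0] + [0.0] * (size - 1)   # start in |0...0>
    state = hadamard_all(state)          # U_0
    state = phase_oracle(state, f)       # the single oracle call O
    state = hadamard_all(state)          # U_1
    return max(range(size), key=lambda y: abs(state[y]))
```

For an oracle $f_a(x) = a \cdot x \bmod 2$, the final state is concentrated on the index $a$, so the simulated measurement returns $a$ after one oracle call.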

Our problem is called the Oracle Identification Problem (OIP). An OIP is
given as an infinite sequence $S_1, S_2, S_4, \ldots, S_N, \ldots$. Each $S_N$ ($N = 2^n$, $n =
0, 1, \ldots$) is a set of oracles (Boolean functions with $n$ variables) whose size, $|S_N|$,
is denoted by $M$ ($\le 2^N$). A (quantum) algorithm $A$ which solves the OIP is a
quantum computation as given above. $A$ has to determine which oracle ($\in S_N$)
is the current input oracle with bounded error. If $A$ needs at most $g(N)$ oracle
calls, we say that the query complexity of $A$ is $g(N)$. It should be noted that $A$
knows the set $S_N$ completely; what is unknown for $A$ is the current input oracle.
For example, the Grover search is an OIP whose $S_N$ contains $N$ (i.e., $M = N$)
Boolean functions $f_1, \ldots, f_N$ such that

$f_i(j) = 1$ iff $i = j$.

Note that $f(j)$ means $f(a_0, a_1, \ldots, a_{n-1})$ ($a_i = 0$ or 1) such that $a_0, \ldots, a_{n-1}$
is the binary representation of the number $j$. Note that $S_N$ is given as an $N \times M$
Boolean matrix. More formally, the entry at row $i$ ($0 \le i \le M-1$) and column $j$
($0 \le j \le N-1$) shows $f_i(j)$. Fig. 1 shows such a matrix of the Grover search for
$N = M = 16$. Each row corresponds to each oracle in $S_N$ and each column to its
Boolean value. Fig. 2 shows another famous example given by an $N \times N$ matrix,
which is called the Bernstein-Vazirani problem [14]. It is well known that there
is an algorithm whose query complexity is just one for this problem [14].
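The two example matrices can be generated as in the following sketch, which indexes rows (oracles) and columns (inputs) from 0 and represents an index by its binary expansion. The function names are illustrative and not part of the paper.

```python
def grover_matrix(n):
    """Rows are the oracles f_i with f_i(j) = 1 iff i = j, for N = 2**n."""
    size = 2 ** n
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]

def bernstein_vazirani_matrix(n):
    """Rows are the oracles f_i with f_i(j) = (inner product of i and j) mod 2."""
    size = 2 ** n
    # bin(i & j).count("1") counts the positions where both i and j have a 1 bit.
    return [[bin(i & j).count("1") % 2 for j in range(size)] for i in range(size)]
```

With `n = 4` both functions return 16 x 16 matrices, matching the size used in Fig. 1 and Fig. 2.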

As described in the previous section, there are several similar (but subtly
different) settings. For example, the problem in [25,4] is given as a matrix which
includes all the rows (oracles) each of which contains $N/2$ 1's or $(1/2 + \epsilon)N$ 1's
for $\epsilon > 0$. We do not have to identify the current input oracle itself but have only
to answer whether the current oracle has $N/2$ 1's or not. (The famous Deutsch-Jozsa
problem [19] is its special case.) The $l$-target Grover search is given as a
matrix consisting of all (or a part of) the rows containing $l$ 1's. Again we do
not have to identify the current input oracle but have to answer with a column
which has value 1 in the current input. Fig. 3 shows an example, where each row
contains $N/2 + 1$ ones. One can see that the multi-target Grover search is easy
($O(1)$ queries are enough since roughly one half of the entries are 1's), but identifying the
input oracle itself is much harder.

In [4], Ambainis gave a very general lower bound for oracle computation.
When applied to the OIP (the original statement is more general), it claims
the following:



Proposition 1. Let $S_N$ be a given set of oracles, and $X, Y$ be two disjoint
subsets of $S_N$. Let $R \subseteq X \times Y$ be such that

1. For every $f_a \in X$, there exist at least $m$ different $f_b \in Y$ such that $(f_a, f_b) \in R$.

2. For every $f_b \in Y$, there exist at least $m'$ different $f_a \in X$ such that $(f_a, f_b) \in R$.

Let $l_{f_a,i}$ be the number of $f_b \in Y$ such that $(f_a, f_b) \in R$ and $f_a(i) \ne f_b(i)$, and
$l_{f_b,i}$ be the number of $f_a \in X$ such that $(f_a, f_b) \in R$ and $f_a(i) \ne f_b(i)$. Let $l_{\max}$
be the maximum of $l_{f_a,i}\, l_{f_b,i}$ over all $(f_a, f_b) \in R$ and $i \in \{0, \ldots, N-1\}$ such
that $f_a(i) \ne f_b(i)$. Then, the query complexity for $S_N$ is $\Omega\left(\sqrt{mm'/l_{\max}}\right)$.


In this paper, we always assume that $M \ge N$. If $M \le N/2$, then we can
select $M$ columns out of the $N$ ones while keeping the uniqueness property of
each oracle. Then by changing the state space from $n$ bits to at most $n - 1$ bits,
we have a new $M \times M$ matrix, i.e., a smaller OIP problem.



3 General Upper Bounds 

As mentioned in the previous section, we have a general lower bound for the
OIP. But we do not know any nontrivial general upper bounds. In this section,
we give two general upper bounds for the case that $M > N$ and for the case
that $M = N$. The former is almost tight as described after the theorem, and
the latter includes the upper bound for the Grover search as a special case. An
$N \times M$ OIP denotes an OIP whose $S_N$ (or simply $S$ by omitting the subscript)
is given as an $N \times M$ matrix as described in the previous section. Before proving
the theorems, we introduce a convenient technique called a Column Flip.

Fig. 1. $f_i(j) = 1$ iff $i = j$

Fig. 2. $f_i(j) = i \cdot j \bmod 2$

Column Flip. Suppose that $S$ is any $N \times M$ matrix (a set of $M$ oracles).
Then any quantum computation for $S$ can be transformed into a quantum
computation for an $N \times M$ matrix $S'$ such that the number of 1's is less than or equal
to the number of 0's in every column. (We say that such a matrix is 1-sensitive.)
The reason is straightforward. If some column in $S$ holds more 1's than 0's, then
we "flip" all the values in that column. Of course we have to change the current oracle into the
new one, but this can be easily done by adding an extra circuit to the output of
the oracle.

Theorem 1. The query complexity of any $N \times M$ OIP is $O(\sqrt{N \log M \log N}
\log\log M)$ if $M > N$.

Proof. To see the idea, we first prove an easier bound, i.e.,
$O(\sqrt{N} \log M \log\log M)$. (Since $M$ can be an exponential function in $N$,
this bound is significantly worse than that of the theorem.) If necessary, we
convert the given matrix $S$ to be 1-sensitive by Column Flip. Then, just apply
the Grover search against the input oracle. If we get a column $j$ (the input
oracle has 1 there), then we can eliminate all the rows having 0 in that column.
The number of such removed rows is at least one half by the 1-sensitivity. Just
repeat this (including the conversion to 1-sensitive matrices) until the number
of rows becomes 1, which needs $O(\log M)$ rounds. Each Grover search needs
$O(\sqrt{N})$ oracle calls. Since we perform many Grover searches, the $\log\log M$ term
is added to take care of the success probability.
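The halving argument of this easier bound can be checked with a purely classical sketch. Here the Grover search is replaced by a linear scan for a column holding a 1 (an assumption made only so that the example runs without a quantum simulator), and the case where the modified oracle has no 1 at all is resolved by keeping the unique all-zero candidate. The names `column_flip` and `identify` are illustrative.

```python
def column_flip(rows):
    """Make the candidate matrix 1-sensitive.
    Returns (flipped_rows, mask), where mask[j] = 1 if column j was flipped."""
    n_cols = len(rows[0])
    mask = []
    for j in range(n_cols):
        ones = sum(r[j] for r in rows)
        mask.append(1 if 2 * ones > len(rows) else 0)
    flipped = [tuple(b ^ m for b, m in zip(r, mask)) for r in rows]
    return flipped, mask

def identify(S, hidden):
    """Identify the oracle S[hidden]; returns (index, number_of_rounds)."""
    candidates = list(range(len(S)))
    rounds = 0
    while len(candidates) > 1:
        rounds += 1
        rows, mask = column_flip([S[i] for i in candidates])
        # The same flip is applied to the output of the black-box oracle.
        oracle = tuple(b ^ m for b, m in zip(S[hidden], mask))
        # Classical stand-in for the Grover search: find any column holding a 1.
        j = next((c for c, b in enumerate(oracle) if b == 1), None)
        if j is None:
            # No 1 exists, so the oracle is the (unique) all-zero candidate.
            candidates = [i for i, r in zip(candidates, rows) if not any(r)]
        else:
            # Keep only the rows consistent with a 1 in column j.
            candidates = [i for i, r in zip(candidates, rows) if r[j] == 1]
    return candidates[0], rounds
```

Because every column of the flipped matrix holds at most half 1's, each round keeps at most half of the candidates, so the number of rounds is at most $\log M$.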

In this algorithm we counted $O(\sqrt{N})$ oracle calls for the Grover search, which
is the target of our improvement. More precisely, our algorithm is the following
quantum procedure. Let $S = \{f_0, \ldots, f_{M-1}\}$ be the given $N \times M$ matrix:
Step 1. Let $Z \subseteq S$ be a set of candidate oracles (or equivalently an $N \times M$
matrix each row of which corresponds to each oracle). Set $Z = S$ initially.

Step 2. Repeat Steps 3-6 until $|Z| = 1$.

Step 3. Convert $Z$ into a 1-sensitive matrix.

Step 4. Compute the largest integer $K$ such that at least one half of the rows of $Z$
contain $K$ 1's or more. (This can be done simply by sorting the rows of $Z$ by
the number of 1's.)

Step 5. For the current (modified) oracle, perform the multi-target Grover
search [12] where we set the maximum number of oracle calls to $O(\sqrt{N/K})$. Iterate
this Grover search $\log\log M$ times (to increase the success probability).

Step 6. If we succeeded in finding a 1 by the Grover search in the previous step,
i.e., a column $j$ such that the current oracle actually has 1 in that column, then
eliminate all the rows of $Z$ having 0 in their column $j$. (Let $Z$ be this reduced
matrix.) Otherwise eliminate all the rows of $Z$ having at least $K$ 1's.
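Step 4 can be sketched as follows: sort the row weights in decreasing order and read off the weight at the median position. The function name `largest_k` is an illustrative choice.

```python
def largest_k(rows):
    """Largest K such that at least half of the rows contain K or more 1's."""
    weights = sorted((sum(r) for r in rows), reverse=True)
    half = (len(rows) + 1) // 2      # smallest count that is at least len(rows) / 2
    return weights[half - 1]         # weight of the half-th heaviest row
```

For example, for the four rows `(1,1,1)`, `(1,0,0)`, `(1,1,0)`, `(0,0,0)` the weights are 3, 1, 2, 0, so two of the four rows have at least two 1's but only one row has three, and the function returns 2.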

Now we estimate the number of oracle calls in this algorithm. Let $M_r$ and $K_r$
be the number of rows of $Z$ and the value of $K$ in the $r$-th repetition, respectively.
Initially, $M_1 = M$. Note that the number of rows of $Z$ becomes $|Z|/2$
or less after Step 6, i.e., $M_{r+1} \le M_r/2$, whether or not the Grover search in Step 5
is successful, since the number of 1's in each column of the modified matrix is
at most $|Z|/2$ and the number of rows which have at least $K$ 1's is $|Z|/2$
or more. Assuming that we need $T$ repetitions to identify the current input
oracle, the total number of oracle calls is

$$O\left( \left( \sqrt{N/K_1} + \cdots + \sqrt{N/K_T} \right) \log\log M \right).$$

We estimate lower bounds on $K_r$. Note that there are no identical rows in $Z$
and the number of possible rows that contain at most $K_r$ 1's is $\sum_{i=0}^{K_r} \binom{N}{i}$ in the
$r$-th repetition. Thus, it must hold that $M_r/2 \le \sum_{i=0}^{K_r} \binom{N}{i}$. Since
$\sum_{i=0}^{K_r} \binom{N}{i} \le 2N^{K_r}$, we have $K_r = \Omega(\log M_r / \log N)$ if $M_r \ge N$; otherwise $K_r \ge 1$.
Therefore the number of oracle calls is at most

$$O\left( \sum_{i=1}^{T'} \sqrt{\frac{N \log N}{\log M_i}} \log\log M + \sqrt{N} \log\log M \log N \right),$$

where the number of rows of $Z$ becomes $N$ or less after the $T'$-th repetition. For
$\{M_1, \ldots, M_{T'}\}$, there exists a sequence of integers $\{k_1, \ldots, k_{T'}\}$ ($1 \le k_1 < \cdots <
k_{T'} \le \log M$) such that

$$\frac{M}{2^{k_r}} < M_r \le \frac{M}{2^{k_r - 1}} \quad \text{for } r = 1, \ldots, T',$$

since $M_r/2 \ge M_{r+1}$ for $r = 1, \ldots, T'$. Thus, we have

$$\sum_{i=1}^{T'} \frac{1}{\sqrt{\log M_i}} \le \sum_{i=1}^{T'} \frac{1}{\sqrt{\log(M/2^{k_i})}} \le \sum_{i=0}^{\log M - 1} \frac{1}{\sqrt{\log M - i}} \le 2\sqrt{\log M}.$$

Then, the total number of oracle calls is $O(\sqrt{N \log M \log N} \log\log M)$.
Next, we consider the success probability of our algorithm. By the analysis
of the Grover search in [12], if the number of 1's of the current modified oracle
is larger than $K_r$ in the $r$-th repetition, then we can find a 1 in the current
modified oracle with probability at least $1 - (3/4)^{\log\log M}$. This success
probability worsens after $T$ rounds of repetition but still remains a constant as follows:
$(1 - (3/4)^{\log\log M})^T \ge (1 - 1/\log M)^{\log M} = \Omega(1)$. □



Theorem 2. There is an OIP whose query complexity is $\Omega\left(\sqrt{\frac{N}{\log N} \log M}\right)$.

Proof. This can be shown in the same way as Theorem 5.1 in [4] as follows.
Let $X$ be the set of all the oracles whose values are 1 at exactly $K$ positions
and $Y$ be the set of all the oracles that have 1's at exactly $K + 1$ positions.
We consider the union of $X$ and $Y$ for our oracle identification problem. Thus,
$M = |X| + |Y| = \binom{N}{K} + \binom{N}{K+1}$; therefore, we have $\log M < K \log N$.
Let also a relation $R$ be the set of all $(f, f')$ such that $f \in X$, $f' \in Y$ and they
differ in exactly a single position. Then the parameters in Theorem 5.1 in [4]
take values $m = N - K$, $m' = K + 1$ and $l = l' = 1$.
Thus the lower bound is $\Omega(\sqrt{(N - K)(K + 1)})$. Since $\log M = O(K \log N)$, $K$
can be as large as $\Omega\left(\frac{\log M}{\log N}\right)$, which implies our lower bound. □

Thus the bound in Theorem 1 is almost tight but not exactly. When $M = N$,
however, we have another algorithm which is tight within a constant factor.
Although we prove the theorem for $M = N$, it also holds for $M = \mathrm{poly}(N)$.

Theorem 3. The query complexity of any $N \times N$ OIP is $O(\sqrt{N})$.



Proof. Let $S$ be the given $N \times N$ matrix. Our algorithm is the following
procedure:

Step 1. Let $Z = S$. If there is a column in $Z$ which has at least $\sqrt{N}$ 0's and
at least $\sqrt{N}$ 1's, then perform a classical oracle call with this column. Eliminate
all the inconsistent rows and update $Z$.

Step 2. Modify $Z$ to be 1-sensitive. Perform the multi-target Grover search
[12] to obtain column $j$.

Step 3. Among the rows in which the column $j$ obtained in Step 2 has 1, find a
column $k$ which has both 0 and 1 (there must be such a column because
any two rows are different). Perform a classical oracle call with column $k$ and
remove inconsistent rows. Update $Z$. Repeat this step until $|Z| = 1$.

Since the correctness of the algorithm is obvious, we only prove the
complexity. A single iteration of Step 1 removes at least $\sqrt{N}$ rows, and hence we can
perform at most $\sqrt{N}$ iterations (at most $\sqrt{N}$ oracle calls). Note that after this
step each column of $Z$ has at most $\sqrt{N}$ 0's or at most $\sqrt{N}$ 1's. Since we perform
the Column Flip in Step 2, we can assume that each column has at most $\sqrt{N}$
1's. The Grover search in Step 2 needs $O(\sqrt{N})$ oracle calls. Since column $j$ has
at most $\sqrt{N}$ 1's, the classical elimination in Step 3 needs at most $\sqrt{N}$ oracle
calls. □
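The classical elimination of Step 3 can be sketched as below, under the assumption that the remaining candidates are given as a dictionary from oracle index to row and that `query(k)` returns the bit of the current oracle at column `k`. Both the interface and the name `classical_elimination` are hypothetical and chosen for this illustration.

```python
def classical_elimination(rows, query):
    """rows: dict mapping oracle index -> tuple of bits (pairwise distinct rows).
    query(k): returns the current oracle's bit at column k.
    Returns (index_of_oracle, number_of_classical_queries)."""
    candidates = dict(rows)
    queries = 0
    while len(candidates) > 1:
        # Take any two remaining rows and a column where they differ.
        (_, ra), (_, rb) = list(candidates.items())[:2]
        k = next(c for c in range(len(ra)) if ra[c] != rb[c])
        bit = query(k)
        queries += 1
        # At least one of the two rows is inconsistent with the answer.
        candidates = {i: r for i, r in candidates.items() if r[k] == bit}
    return next(iter(candidates)), queries
```

Each query removes at least one candidate, so starting from at most $\sqrt{N}$ candidates the loop makes at most $\sqrt{N}$ oracle calls, in line with the count used in the proof.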
4 Tight Upper Bounds for Small M



In this section, we investigate the case that $M = N$ in more detail. Note that
Theorem 3 is tight for the whole $N \times N$ OIP but not for its subfamilies. (For
example, the Bernstein-Vazirani problem needs only $O(1)$ queries.) To seek optimal bounds
for subfamilies, we introduce the following parameter: Let $S$ be an OIP given as
an $N \times M$ matrix. Then let $\#(S)$ be the maximum number of 1's in a single column
of the matrix. We first give a lower bound theorem in terms of this parameter,
which is a simplified version of Proposition 1.



Theorem 4. Let $S$ be an $N \times M$ matrix and $K = \#(S)$. Then $S$ needs $\Omega(\sqrt{M/K})$
queries.



Proof. Without loss of generality, we can assume that $S$ is 1-sensitive, i.e., $K \le
M/2$. We select $X$ ($Y$, resp.) as the upper (lower, resp.) half of $S$ (i.e., $|X| =
|Y| = M/2$) and set $R = X \times Y$ (i.e., $(x, y) \in R$ for every $x \in X$ and $y \in Y$). Let
$\delta_j$ be the number of 1's in the $j$-th column of $Y$. Now it is not hard to see that
we can set $m = m' = M/2$ and $l_{x,j}\, l_{y,j} = \max\{\delta_j(M/2 - K_j + \delta_j), (M/2 - \delta_j)(K_j - \delta_j)\}$,
where $K_j$ is the number of 1's in column $j$. Since $K_j \le K$, this value is bounded
from above by $\frac{M}{2} K$. Hence, Proposition 1 implies the bound $\Omega(\sqrt{M/K})$. □
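The last step of this proof can be written out explicitly. The sketch below assumes that Proposition 1 yields a bound of the form $\Omega(\sqrt{mm'/l_{\max}})$, as in the lower bound theorem of [4], and substitutes the values chosen in the proof.

```latex
\[
  \sqrt{\frac{m\,m'}{l_{\max}}}
  \;\ge\; \sqrt{\frac{(M/2)\,(M/2)}{(M/2)\,K}}
  \;=\; \sqrt{\frac{M}{2K}}
  \;=\; \Omega\!\left(\sqrt{\frac{M}{K}}\right).
\]
```

The inequality uses only $m = m' = M/2$ and $l_{\max} \le (M/2)K$; the factor $1/\sqrt{2}$ is absorbed into the $\Omega$ notation.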



Although this lower bound looks much simpler than Proposition 1, it is
equally powerful for many cases. For example, we can obtain an $\Omega(\sqrt{N})$ lower
bound for the OIP given in Fig. 3, which we denote by $X$. Note in general that
if we need $t$ queries for a matrix $S$, then we also need at least $t$ queries for any
$S'$ if $S \subseteq S'$. Therefore it is enough to obtain a lower bound for the matrix $X'$ which
consists of the $N/2$ upper-half rows of $X$, where all the 1's of the right half can be
changed to 0's by the Column Flip. Since $\#(X') = 1$, Theorem 4 gives us a
lower bound of $\Omega(\sqrt{N})$.

Now we give tight upper bounds for three subfamilies of $N \times N$ matrices.
The first one is not a worst-case bound but an average-case bound: Let $AV(K)$
be an $N \times N$ matrix where each entry is 1 with probability $K/N$.

Theorem 5. The query complexity for $AV(K)$ is $\Theta(\sqrt{N/K})$ with high
probability if $K = N^{\alpha}$ for $0 < \alpha < 1$.



Proof. Suppose that $X$ is an $AV(K)$. By using a standard Chernoff-bound
argument, we can show that the following three statements hold for $X$ with high
probability (proofs are omitted). (i) Let $c_i$ be the number of 1's in column $i$.
Then for any $i$, $\frac{1}{2}K \le c_i \le 2K$. (ii) Let $r_j$ be the number of 1's in row $j$. Then
for any $j$, $\frac{1}{2}K \le r_j \le 2K$. (iii) Suppose that $D$ is a set of any $d$ columns in $X$
($d$ is a function of $\alpha$ which is constant since $\alpha$ is a constant). Then the number
of rows which have 1's in all the columns in $D$ is at most $2 \log N$.

Our lower bound is immediate from (i) by Theorem 4. For the upper bound,
our algorithm is quite simple. Just perform the Grover search independently $d$
times. Each single round needs $O(\sqrt{N/K})$ oracle calls by (ii). After that the
number of candidates is decreased to $2 \log N$ by (iii). Then we simply perform
the classical elimination, just as in Step 3 of the algorithm in the proof of
Theorem 3, which needs at most $2 \log N$ oracle calls. Since $d$ is a constant, the overall
complexity is $O(\sqrt{N/K}) + \log N = O(\sqrt{N/K})$ if $K = N^{\alpha}$. □

The second subfamily is called a balanced matrix. Let $B(K)$ be a family of
$N \times N$ matrices in which every row and every column has exactly $K$ 1's. (Again
the theorem holds if the number of 1's is $\Theta(K)$.)

Theorem 6. The query complexity for $B(K)$ is $\Theta(\sqrt{N/K})$ if $K \le N^{1/3}$.

Proof. The lower-bound part is obvious by Theorem 4. The upper-bound part
is to use a single Grover search followed by $K$ classical eliminations. Thus the complexity
is $O(\sqrt{N/K} + K)$, which is $O(\sqrt{N/K})$ if $K \le N^{1/3}$. □

The third one is somewhat artificial. Let $H(k)$, called a hybrid matrix because
it is a combination of Grover and Bernstein-Vazirani, be a matrix defined
as follows: Let $a = (a_1, a_2, \ldots, a_{n-k}, a_{n-k+1}, \ldots, a_n)$ and
$x = (x_1, x_2, \ldots, x_{n-k}, x_{n-k+1}, \ldots, x_n)$. Then $f_a(x) = 1$ iff (i) $(a_1, \ldots, a_{n-k}) =
(x_1, \ldots, x_{n-k})$ and (ii) $(a_{n-k+1}, \ldots, a_n) \cdot (x_{n-k+1}, \ldots, x_n) = 0 \pmod 2$. Fig. 4 shows
the case that $k = 2$ and $n = 4$.
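The definition can be turned into a small generator for $H(k)$. This is a sketch only: the ordering of rows and columns (lexicographic over $n$-bit tuples) and the name `hybrid_matrix` are assumptions made for the illustration.

```python
from itertools import product

def hybrid_matrix(n, k):
    """Rows are indexed by a and columns by x, both n-bit tuples in lexicographic order."""
    points = list(product((0, 1), repeat=n))

    def f(a, x):
        # (i) the first n-k bits must agree (Grover part)
        prefix_match = a[:n - k] == x[:n - k]
        # (ii) the inner product of the last k bits must be 0 mod 2 (Bernstein-Vazirani part)
        parity = sum(ai & xi for ai, xi in zip(a[n - k:], x[n - k:])) % 2
        return 1 if prefix_match and parity == 0 else 0

    return [[f(a, x) for x in points] for a in points]
```

For `n = 4` and `k = 2` this produces a 16 x 16 matrix with pairwise distinct rows whose heaviest column contains $2^k = 4$ ones, which is the value of $K$ used in Theorem 7.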

Theorem 7. The query complexity for $H(k)$ is $\Theta(\sqrt{N/K})$, where $K = 2^k$.

Proof. We combine the Grover search [22,12] with the BV algorithm [14] to identify
the oracle $f_a$ by determining the hidden value $a$ of $f_a$. We first determine the
first $n - k$ bits of $a$. Fixing the last $k$ bits to $|0\rangle$, we apply the Grover search using
oracle $f_a$ for the first $n - k$ bits to determine $a_1, \ldots, a_{n-k}$. It should be noted
that $f_a(a_1, \ldots, a_{n-k}, 0, \ldots, 0) = 1$ and $f_a(x_1, \ldots, x_{n-k}, 0, \ldots, 0) = 0$ for any
$(x_1, \ldots, x_{n-k}) \ne (a_1, \ldots, a_{n-k})$. Next, we apply the BV algorithm to determine the remaining
$k$ bits of $a$. This algorithm requires $O(\sqrt{N/K})$ queries for the Grover search
and $O(1)$ queries for the BV algorithm to determine $a$. Therefore we can identify
the oracle $f_a$ using $O(\sqrt{N/K})$ queries. □



5 Classical Lower and Upper Bounds 

The lower bound for the general $N \times M$ OIP is obviously $N$ if $M > N$. When
$M = N$, we can obtain bounds smaller than $N$ for some cases.

Theorem 8. The deterministic query complexity for an $N \times N$ OIP $S$ with $\#(S) =
K$ is at least $\lfloor N/K \rfloor + \lfloor \log K \rfloor - 2$.

Proof. Let $f_a$ be the current input oracle. The following proof is due to the
standard adversary argument. Let $A$ be any deterministic algorithm using the
oracle $f_a$. Suppose that we determine $a \in \{0, 1\}^n$ to identify the oracle $f_a$. Then
the execution of $A$ is described as follows: (i) In the first round, $A$ calls the oracle
with the predetermined value $x_0$ and the oracle answers with $d_0 = f_a(x_0)$. (ii)
In the second round, $A$ calls the oracle with value $x_1$, which is determined by
$d_0$, and the oracle answers with $d_1 = f_a(x_1)$. (iii) In the $(i + 1)$-st round, $A$ calls
the oracle with $x_i$, which is determined by $d_0, d_1, \ldots, d_{i-1}$, and the oracle answers
with $d_i = f_a(x_i)$. (iv) In the $m$-th round $A$ outputs $a$, which is determined by
$d_0, d_1, \ldots, d_{m-1}$, and stops. Thus, the execution of $A$ is completely determined by
the sequence $(d_0, d_1, \ldots, d_{m-1})$, which is denoted by $A(a)$. (Obviously, if we fix a
specific $a$, then $A(a)$ is uniquely determined.)

Fig. 3. Our problem is much harder

Fig. 4. $H(k)$ with $n = 4$ and $k = 2$

Let $m_0 = \lfloor N/K \rfloor + \lfloor \log K \rfloor - 3$ and suppose that $A$ halts in the $m_0$-th round.
We compute the sequence $(c_0, c_1, \ldots, c_{m_0})$, $c_i \in \{0, 1\}$, and another sequence
$(L_0, L_1, \ldots, L_{m_0})$, $L_i \subseteq \{a \mid a \in \{0, 1\}^n\}$, as follows (note that $c_0, \ldots, c_{m_0}$ are
similar to $d_0, \ldots, d_{m-1}$ above and are chosen by the adversary): (i) $L_0 = \{0, 1\}^n$.
(ii) Suppose that we have already computed $L_0, \ldots, L_i$ and $c_0, \ldots, c_{i-1}$. Let $x_i$ be
the value with which $A$ calls the oracle in the $(i + 1)$-st round. (Recall that $x_i$ is
determined by $c_0, \ldots, c_{i-1}$.) Let $L^0 = \{s \mid f_s(x_i) = 0\}$ and $L^1 = \{s \mid f_s(x_i) = 1\}$.
Then if $|L_i \cap L^0| \ge |L_i \cap L^1|$, we set $c_i = 0$ and $L_{i+1} = L_i \cap L^0$. Otherwise,
i.e., if $|L_i \cap L^0| < |L_i \cap L^1|$, we set $c_i = 1$ and $L_{i+1} = L_i \cap L^1$.

Now we can make the following two claims. 

Claim 1. $|L_{m_0}| \ge 2$. (Reason: Note that $|L_0| = N$ and the size of $L_i$ decreases
as $i$ increases. By the construction of $L_i$, one can see that until $|L_i|$ becomes $2K$,
its size decreases additively by at most $K$ in a single round and after that it
decreases multiplicatively by at most one half. The claim then follows by a simple
calculation.)

Claim 2. If $a \in L_{m_0}$, then $(c_0, \ldots, c_{m_0}) = A(a)$. (Reason: Obvious since
$a \in L_0 \cap L_1 \cap \cdots \cap L_{m_0}$.)

Now it follows that there are two different $a_1$ and $a_2$ in $L_{m_0}$ such that $A(a_1) =
A(a_2)$ by Claims 1 and 2. Therefore $A$ outputs the same answer for two different
$a_1$ and $a_2$, a contradiction. □

For the classical upper bounds, we only give the bound for the hybrid matrix.
The bounds for $AV(K)$ and $B(K)$ are obtained similarly.

Theorem 9. The deterministic query complexity for $H(k)$ is $O(N/K + \log K)$.
Proof. Let $f_a$ be the current input oracle. The algorithm consists of an exhaustive
search and a binary search to identify the oracle $f_a$ by determining the hidden value
$a$ of $f_a$. First, we determine the first $n - k$ bits of $a$ by fixing the last $k$ bits to
all 0's and using exhaustive search. Second, we determine the last $k$ bits of $a$ by
using binary search. This algorithm needs $2^{n-k}$ ($= N/K$) queries in the exhaustive
search, and $O(k)$ ($= O(\log K)$) queries in the binary search. Therefore, the total
complexity of this algorithm is $O(2^{n-k} + k) = O(N/K + \log K)$. □
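A classical procedure with this query count can be sketched as follows. The first phase is the exhaustive search described in the proof. For the last $k$ bits, the sketch queries one unit vector per bit, which is one simple way to spend $k$ queries and is an assumption of this example rather than necessarily the search intended in the proof. The helper `make_oracle` and its query counter are hypothetical scaffolding for testing.

```python
from itertools import product

def make_oracle(a, k):
    """Build the black-box f_a for H(k) together with a query counter."""
    n = len(a)
    counter = {"queries": 0}

    def f(x):
        counter["queries"] += 1
        if tuple(x[:n - k]) != tuple(a[:n - k]):
            return 0
        parity = sum(ai & xi for ai, xi in zip(a[n - k:], x[n - k:])) % 2
        return 1 if parity == 0 else 0

    return f, counter

def identify_hybrid(f, n, k):
    """Recover a using at most 2**(n-k) + k classical queries."""
    zeros = (0,) * k
    prefix = None
    # Phase 1: exhaustive search over the first n-k bits with the last k bits set to 0.
    for p in product((0, 1), repeat=n - k):
        if f(p + zeros) == 1:
            prefix = p
            break
    # Phase 2: one query per remaining bit; f(prefix, e_i) = 1 iff that bit of a is 0.
    suffix = []
    for i in range(k):
        unit = tuple(1 if t == i else 0 for t in range(k))
        suffix.append(0 if f(prefix + unit) == 1 else 1)
    return prefix + tuple(suffix)
```

Phase 1 always succeeds because the inner product with the all-zero suffix is 0, so $f_a$ answers 1 exactly on the correct prefix.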

6 Concluding Remarks 

Some future directions are as follows: The most interesting one is a possible
improvement of Theorem 1, for which our target is $O(\sqrt{N \log M})$. Also, we wish to
have a matching lower bound, which is probably possible by making the
argument of Theorem 2 a bit more exact. As mentioned before, in certain situations
we do not have to determine the current oracle completely but have only to do
so "approximately", e.g., to determine whether it belongs to some subset
of oracles. It might be interesting to investigate how this approximation makes
the problem easier (or basically not).
Abstract. Motivated by problems of pattern statistics, we study the limit distri-
bution of the random variable counting the number of occurrences of the symbol 
a in a word of length n chosen at random in {a, b}*, according to a probability 
distribution defined via a finite automaton equipped with positive real weights. We 
determine the local limit distribution of such a quantity under the hypothesis that 
the transition matrix naturally associated with the finite automaton is primitive. 
Our probabilistic model extends the Markovian models traditionally used in the 
literature on pattern statistics. 

This result is obtained by introducing a notion of symbol-periodicity for irre- 
ducible matrices whose entries are polynomials in one variable over an arbitrary 
positive semiring. This notion and the related results we prove are of interest in 
their own right, since they extend classical properties of the Perron-Frobenius 
Theory for non-negative real matrices. 

Keywords: Automata and Formal Languages, Pattern statistics, Local Limit
Theorems, Perron-Frobenius Theory.



1 Introduction 

A typical problem in pattern statistics studies the frequency of occurrences of given
strings in a random text, where the set of strings (patterns) is fixed in advance and the text
is a word of length n randomly generated according to a probabilistic model (for instance,
a Markovian model). In this context, relevant goals of research concern the asymptotic
evaluations (as n grows) of the mean value and the variance of the number of occurrences
of patterns in the text, as well as its limit distribution. Problems of this kind are widely
studied in the literature and they are of interest for the large variety of applications in
different areas of computer science, probability theory and molecular biology (see for
instance [8,12,11,14]). Many results show a normal limit distribution of the number of
pattern occurrences in the sense of the central or local limit theorem [1]; here we recall
that the "local" result is usually stronger since it concerns the probability of single point
values, while the "central" limit refers to the cumulative distribution function. In [10]
limit distributions are obtained for the number of (positions of) occurrences of words
from a regular language in a random string of length n generated in a Bernoulli or a
Markovian model. These results are extended in [3] to the so-called rational stochastic
model, where the pattern is reduced to a single symbol and the random text is a word
over a two-letter alphabet, generated according to a probability distribution defined via
a weighted finite automaton or, equivalently, via a rational formal series. The symbol
frequency problem in the rational model includes, as a special case, the general frequency
problem of regular patterns in the Markovian model studied in [10]. In the same paper
[3], a normal local limit theorem is obtained for a proper subclass of primitive models.
In this paper, we present a complete solution for primitive models, i.e. when the matrix
associated with the rational formal series (counting the transitions between states) is
primitive.

* This work has been supported by the Project M.I.U.R. COFIN 2003-2005 "Formal languages
and automata: methods, models and applications".

We now turn to a brief description of this paper. In Section 3, we introduce a notion 
of x-periodicity for irreducible matrices whose entries are polynomials in the variable x 
over an arbitrary positive semiring. Intuitively, considering the matrix as a labeled graph, 
its x-period is the GCD of the differences between the number of occurrences of x in 
(labels of) cycles of the same length. This notion and the related properties we prove 
are of interest in their own right, since they extend the classical notion of periodicity 
of non-negative matrices, studied in the Perron-Frobenius Theory for irreducible and 
primitive matrices [13]. In particular, these results are useful to study the eigenvalues of 
matrices of the form Ax + B, where A and B are matrices with coefficients in $\mathbb{R}_+$ and
$x \in \mathbb{C}$ with $|x| = 1$ (see Theorem 2).

In Section 4 we prove our main result, concerning the local limit distribution of the 
random variable representing the number of occurrences of the symbol a in a word of 
length n chosen at random in {a, b}*, according to any primitive rational model. Such a 
model can be described by means of a primitive matrix of the form Ax + B, where A and 
B are non-negative real matrices. If Ax + B has x-period d, then we prove the existence
of positive real constants $\alpha$, $\beta$ and non-negative real constants $C_0, C_1, \ldots, C_{d-1}$ with
$\sum_i C_i = 1$ such that, as n tends to $\infty$, the relation

$$P\{Y_n = k\} = \frac{d\, C_{\langle k \rangle_d}}{\sqrt{2\pi\alpha n}}\; e^{-\frac{(k-\beta n)^2}{2\alpha n}} + o\!\left(\frac{1}{\sqrt{n}}\right)$$

holds uniformly for each $k = 0, 1, \ldots, n$ (here $\langle k \rangle_d = k - d\lfloor k/d \rfloor$ is the residue of k modulo d). If, in particular,
d = 1 we get a normal local limit distribution, as already stated in [3].



2 Preliminaries 

In this section we recall some basic notions and properties concerning rational formal 
series [2] and matrices over positive semirings [13]. 
2.1 Rational Formal Series and Weighted Automata

Let S be a positive semiring [9], that is a semiring such that x + y = 0 implies x = y = 0
and x · y = 0 implies x = 0 or y = 0. Examples are given by $\mathbb{N}$, $\mathbb{R}_+$ or the Boolean
algebra $\mathbb{B}$. Given a finite alphabet $\Sigma$, we denote by $\Sigma^*$ the set of all finite strings over
$\Sigma$ and by 1 the empty word. Moreover, for each $w \in \Sigma^*$, we denote by $|w|$ its length
and by $|w|_b$ the number of occurrences of the symbol $b \in \Sigma$ in w.

We recall that a formal series over $\Sigma$ with coefficients in S is a function $r : \Sigma^* \to S$.
Usually, the value of r at w is denoted by (r, w) and we write $r = \sum_{w \in \Sigma^*} (r, w)\, w$.

Moreover, r is called rational if there exists a linear representation, that is a triple
$(\xi, \mu, \eta)$ where, for some integer m > 0, $\xi$ and $\eta$ are (column) vectors in $S^m$ and
$\mu : \Sigma^* \to S^{m \times m}$ is a monoid morphism, such that $(r, w) = \xi^T \mu(w)\, \eta$ holds for each
$w \in \Sigma^*$. We say that m is the size of the representation. Observe that considering such
a triple $(\xi, \mu, \eta)$ is equivalent to defining a (weighted) non-deterministic automaton,
where the state set is given by $\{1, 2, \ldots, m\}$ and the transitions, the initial and the final
states are assigned weights in S by $\mu$, $\xi$ and $\eta$ respectively.

It is convenient to represent the morphism $\mu$ by its state diagram, see Figure 1, which
is a labeled directed graph where the vertices are given by the set $\{1, 2, \ldots, m\}$ and
where there exists an edge with label $b \in \Sigma$ from vertex p to vertex q if $\mu(b)_{pq} \neq 0$. A
path of length n is a sequence of labeled edges of the form

$$\ell = q_0 \xrightarrow{b_1} q_1 \xrightarrow{b_2} q_2 \cdots q_{n-1} \xrightarrow{b_n} q_n ;$$

in particular, if $q_n = q_0$ we say that $\ell$ is a $q_0$-cycle. Moreover we say that $w = b_1 b_2 \cdots b_n$
is the label of $\ell$ and we denote by $|\ell|_b = |w|_b$ the number of occurrences of b in $\ell$.

Since we are interested in the occurrences of a particular symbol $a \in \Sigma$, we may
set $A = \mu(a)$, $B = \sum_{b \neq a} \mu(b)$ and consider the a-counting matrix M(x) = Ax + B,
which can be interpreted as a matrix whose entries are polynomials in S[x] of degree
lower than 2. Moreover, observe that for every $n \in \mathbb{N}$ we can write

$$\xi^T M(x)^n \eta = \sum_{|w| = n} (r, w) \cdot x^{|w|_a} . \qquad (1)$$

Therefore $M(x)^n$ is related to the paths of length n of the associated state diagram,
in the sense that the pq-entry of $M(x)^n$ is the sum of monomials of the form $s x^k$ where
$k = |\ell|_a$ for some path $\ell$ of length n from p to q in the state diagram.
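
Identity (1) can be checked mechanically on a small instance. The following Python sketch is only an illustration under stated assumptions: it reads the automaton of Figure 1 over the natural numbers (every edge has weight 1, a-labelled edges go into A and b-labelled edges into B), takes $q_1$ as the only initial and final state, and uses helper names (`poly_add`, `poly_mul`, `series_by_matrix_power`, `series_by_enumeration`) that are hypothetical and not part of the paper. Polynomials are stored as coefficient lists indexed by the exponent of x.

```python
from itertools import product

# Assumed instance: Figure 1 with unit weights; A = a-labelled edges, B = b-labelled edges.
A = [[0, 0, 0, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 0]]
B = [[0, 1, 0, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 0]]
XI = [1, 0, 0, 0]    # initial weights: start in q1
ETA = [1, 0, 0, 0]   # final weights: stop in q1


def poly_add(f, g):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [u + v for u, v in zip(f, g)]


def poly_mul(f, g):
    """Product of two polynomials; the empty list stands for 0."""
    if not f or not g:
        return []
    h = [0] * (len(f) + len(g) - 1)
    for i, u in enumerate(f):
        for j, v in enumerate(g):
            h[i + j] += u * v
    return h


def trim(f):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def mat_mul(P, Q):
    """Product of two square matrices whose entries are polynomials."""
    m = len(P)
    R = [[[] for _ in range(m)] for _ in range(m)]
    for p in range(m):
        for q in range(m):
            for r in range(m):
                R[p][q] = poly_add(R[p][q], poly_mul(P[p][r], Q[r][q]))
    return R


def series_by_matrix_power(n):
    """Left-hand side of (1): xi^T M(x)^n eta with M(x) = A x + B."""
    m = len(A)
    M = [[[B[p][q], A[p][q]] for q in range(m)] for p in range(m)]
    P = [[[1] if p == q else [] for q in range(m)] for p in range(m)]  # identity
    for _ in range(n):
        P = mat_mul(P, M)
    total = []
    for p in range(m):
        for q in range(m):
            total = poly_add(total, [XI[p] * c * ETA[q] for c in P[p][q]])
    return trim(total)


def series_by_enumeration(n):
    """Right-hand side of (1): sum of (r, w) x^{|w|_a} over all words w of length n."""
    m = len(A)
    coeffs = [0] * (n + 1)
    for word in product("ab", repeat=n):
        v = XI[:]
        for c in word:
            mat = A if c == "a" else B
            v = [sum(v[p] * mat[p][q] for p in range(m)) for q in range(m)]
        coeffs[word.count("a")] += sum(v[q] * ETA[q] for q in range(m))
    return trim(coeffs)
```

For n = 6 both functions return the coefficient list of $x^2 + x^6$: the only $q_1$-cycles of length 6 are the 3-step cycle taken twice (two occurrences of a) and the 2-step cycle taken three times (six occurrences of a).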

2.2 Matrix Periodicity 

We now recall the classical notion of periodicity of matrices over positive semirings.
Given a finite set Q and a positive semiring S, consider a matrix $M : Q \times Q \to S$. We
say that M is positive whenever $M_{pq} \neq 0$ holds for all $p, q \in Q$, in which case we write
M > 0.

To avoid the use of brackets, from now on, we use the expression $M^n_{pq}$ to denote the
pq-entry of the matrix $M^n$. For every index q, we call period of q the greatest common
divisor (GCD) of the positive integers h such that $M^h_{qq} \neq 0$, with the convention that
$$M = \begin{pmatrix} 0 & 1 & 0 & x \\ 0 & 0 & x & 0 \\ 1 & 0 & 0 & 0 \\ x & 0 & 0 & 0 \end{pmatrix}$$

Fig. 1. Example of state diagram and a-counting matrix



$\mathrm{GCD}(\emptyset) = +\infty$. Moreover, we recall that a matrix M is said to be irreducible if for
every pair of indices p, q, there exists a positive integer h = h(p, q) such that $M^h_{pq} \neq 0$;
in this case, it turns out that all indices have the same period, which is finite and is called
the period of M. Finally, the matrix is called primitive if there exists a positive integer
h such that $M^h > 0$, which implies $M^n > 0$ for every $n \geq h$. It is well-known that M
is primitive if and only if M is irreducible and has period 1.
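
The definitions of period, irreducibility and primitivity can be tried out on Boolean matrices. The sketch below is a minimal illustration, not a procedure taken from the paper: the function names are hypothetical, the value 0 is used to encode $\mathrm{GCD}(\emptyset) = +\infty$, and the period is computed from closed walks of length at most 3m, which is enough for an irreducible matrix of size m (each simple cycle of length c is reached by two closed walks at q whose lengths differ by c and do not exceed 3m).

```python
from math import gcd

# Supports of the matrices of Figure 1, Example 2 and Figure 2 (1 marks a non-zero entry).
FIG1 = [[0, 1, 0, 1],
        [0, 0, 1, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
EX2 = [[0, 1],
       [1, 0]]
FIG2 = [[0, 1, 1, 0, 0],
        [1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
        [1, 0, 0, 0, 0]]


def bool_mul(P, Q):
    """Product of two square Boolean matrices."""
    m = len(P)
    return [[any(P[p][r] and Q[r][q] for r in range(m)) for q in range(m)]
            for p in range(m)]


def is_irreducible(M):
    """True if every index reaches every index by a path of positive length."""
    m = len(M)
    reach = [[bool(e) for e in row] for row in M]
    for r in range(m):                      # Warshall's transitive closure
        for p in range(m):
            for q in range(m):
                reach[p][q] = reach[p][q] or (reach[p][r] and reach[r][q])
    return all(all(row) for row in reach)


def period(M, q=0):
    """GCD of the lengths h <= 3m with M^h_qq != 0; returns 0 when there is none."""
    m = len(M)
    S = [[bool(e) for e in row] for row in M]
    P, g = S, 0
    for h in range(1, 3 * m + 1):
        if P[q][q]:
            g = gcd(g, h)
        P = bool_mul(P, S)
    return g


def is_primitive(M):
    """Primitive means irreducible with period 1."""
    return is_irreducible(M) and period(M) == 1
```

With these inputs the matrix of Figure 1 is primitive, while the matrices of Example 2 and Figure 2 are irreducible with period 2.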

When S is the semiring of positive real numbers an important result is given by the 
following theorem (see [13]). 

Theorem 1 (Perron-Frobenius). Let M be a primitive matrix with entries in $\mathbb{R}_+$. Then,
M admits exactly one eigenvalue $\lambda$ of maximum modulus (called the Perron-Frobenius
eigenvalue of M), which is a simple root of the characteristic polynomial of M. Moreover,
$\lambda$ is real and positive and there exist strictly positive left and right eigenvectors u and v
associated with $\lambda$ such that $v^T u = 1$.

A consequence of this theorem is that, for any primitive matrix M with entries in $\mathbb{R}_+$,
the relation $M^n \sim \lambda^n \cdot u v^T$ holds as n tends to $+\infty$, where $\lambda$, u and v are defined as
above. A further application is given by the following proposition [13, Exercise 1.9], to
be used in the next sections.

Proposition 1. Let C be a complex matrix, set $|C| = (|C_{pq}|)$ and let $\gamma$ be one of the
eigenvalues of C. If M is a primitive matrix over $\mathbb{R}_+$ such that $|C_{pq}| \leq M_{pq}$ for every
p, q and if $\lambda$ is its Perron-Frobenius eigenvalue, then $|\gamma| \leq \lambda$. Moreover, if $|\gamma| = \lambda$, then
necessarily $|C| = M$.

3 The Symbol-Periodicity of Matrices 

In this section we introduce the notion of x-periodicity for matrices in the semiring S[x]
of polynomials in the variable x with coefficients in S and focus more specifically on
the case of irreducible matrices.

3.1 The Notion of x-Periodicity

Given a polynomial $F = \sum_k f_k x^k \in S[x]$, we define the x-period of F as the integer
$d(F) = \mathrm{GCD}\{|h - k| : f_h \neq 0 \neq f_k\}$, where we assume $\mathrm{GCD}(\{0\}) = \mathrm{GCD}(\emptyset) = +\infty$.
Observe that $d(F) = +\infty$ if and only if F = 0 or F is a monomial.
Now consider a finite set Q and a matrix $M : Q \times Q \to S[x]$. For any index $q \in Q$
and for each integer n we set $d(q, n) = d(M^n_{qq})$ and we define the x-period of q
as the integer $d(q) = \mathrm{GCD}\{d(q, n) \mid n \geq 0\}$, assuming that any non-zero element
in $\mathbb{N} \cup \{+\infty\}$ divides $+\infty$. Notice that if M is the a-counting matrix of some linear
representation, this definition implies that for every index q and for every pair of q-cycles
$C_1$ and $C_2$ of equal length, $|C_1|_a - |C_2|_a$ is a multiple of d(q).

Proposition 2. If M is an irreducible matrix over S[x], then all indices have the same
x-period.

Proof. Consider an arbitrary pair of indices p, q. By symmetry, it suffices to prove that
d(p) divides d(q), and this again can be proven by showing that d(p) divides d(q, n) for
all $n \in \mathbb{N}$. As M is irreducible, there exist two integers s, t such that $M^s_{pq} \neq 0 \neq M^t_{qp}$.
Then the polynomial $M^{s+t}_{pp} = \sum_r M^s_{pr} M^t_{rp} \neq 0$ and for some $k \in \mathbb{N}$ there exists
a monomial in $M^{s+t}_{pp}$ with exponent k. Therefore, for every exponent h in $M^n_{qq}$, the
integer h + k appears as an exponent in $M^{n+s+t}_{pp}$. This proves that d(p, n + s + t)
divides d(q, n) and since d(p) divides d(p, n + s + t), this establishes the result. □



Definition 1. The x-period of an irreducible matrix over S[x] is the common x-period
of its indices.



Example 1. We compute the x-period of the matrix M over $\mathbb{B}[x]$ corresponding to the
state diagram represented in Figure 1. Consider for instance state $q_1$ and let $C_1$ and $C_2$
be two arbitrary $q_1$-cycles having the same length. Clearly they can be decomposed
by using the simple $q_1$-cycles of the automaton, namely $\ell_1 = q_1 \xrightarrow{a} q_4 \xrightarrow{a} q_1$ and
$\ell_2 = q_1 \xrightarrow{b} q_2 \xrightarrow{a} q_3 \xrightarrow{b} q_1$. Hence, except for their order, $C_1$ and $C_2$ only differ in the number
of cycles $\ell_1$ and $\ell_2$ they contain: for k = 1, 2, let $s_k \in \mathbb{Z}$ be the difference between the
number of $\ell_k$ contained in $C_1$ and the number of $\ell_k$ contained in $C_2$. Then, necessarily,
$s_1|\ell_1| + s_2|\ell_2| = 0$, that is $2s_1 + 3s_2 = 0$. This implies that $s_1 = 3n$ and $s_2 = -2n$ for
some $n \in \mathbb{Z}$. Hence

$$|C_1|_a - |C_2|_a = 3n|\ell_1|_a - 2n|\ell_2|_a = 6n - 2n = 4n .$$

This proves that 4 is a divisor of the x-period of M. Moreover, both the $q_1$-cycles $\ell_1^3$
and $\ell_2^2$ have length equal to 6 and the numbers of occurrences of a differ exactly by 4.
Hence, in this case, the x-period of M is exactly 4. □
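
The computation of Example 1 can be reproduced by brute force. In the following Python sketch, offered only as an illustration, a polynomial over $\mathbb{B}[x]$ is represented by the set of its exponents, and the x-period of an index is approximated by the GCD of the values d(q, n) for n up to a chosen bound. The bound `max_n`, the encoding of $+\infty$ as 0 and the function names are assumptions of the sketch; in general the returned value is only a multiple of the true x-period, since larger n can only shrink the GCD.

```python
from math import gcd

E = frozenset()                      # the zero polynomial (no exponents)
ONE = frozenset({0})                 # the monomial 1
X = frozenset({1})                   # the monomial x

FIG1 = [[E, ONE, E, X],
        [E, E, X, E],
        [ONE, E, E, E],
        [X, E, E, E]]
EX2 = [[E, X],
       [ONE, E]]


def mul(P, Q):
    """Product of matrices over B[x]; each entry is the set of exponents that occur."""
    m = len(P)
    return [[frozenset(i + j for r in range(m) for i in P[p][r] for j in Q[r][q])
             for q in range(m)] for p in range(m)]


def x_period(M, q=0, max_n=12):
    """GCD of the exponent differences in M^n_qq for 1 <= n <= max_n (0 encodes +infinity)."""
    g, P = 0, M
    for _ in range(max_n):
        exps = sorted(P[q][q])
        for e in exps[1:]:
            g = gcd(g, e - exps[0])   # differences to the minimum generate all differences
        P = mul(P, M)
    return g
```

On the matrix of Figure 1 the sketch returns 4, in agreement with Example 1 (the value is already reached at n = 6, where the exponents 2 and 6 both occur). On a two-state matrix with entries $M_{12} = x$, $M_{21} = 1$ and zero diagonal it returns 0, that is, an infinite x-period.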



In the particular case where the entries of the matrix are all linear in x, the matrix
decomposes as M = Ax + B, where A and B are matrices over S; this clearly happens
when M is the a-counting matrix of some linear representation. If further M is primitive,
the following proposition holds.

Proposition 3. Let A and B be matrices over S and set M = Ax + B. If M is primitive
and $A \neq 0 \neq B$, then the x-period of M is finite.
Proof. Let q be an arbitrary index and consider the finite family of pairs $\{(n_j, k_j)\}_{j \in J}$
such that $0 \leq k_j \leq n_j \leq m$ where m is the size of M and $k_j$ appears as an exponent
in $M^{n_j}_{qq}$. Notice that since M is irreducible J is not empty. Since every cycle can be
decomposed into elementary cycles all of which of length at most equal to m, the result
is proved once we show that $d(q) = +\infty$ implies either $k_j = 0$ for all $j \in J$ or $k_j = n_j$
for all $j \in J$: in the first case we get A = 0 while in the second case we have B = 0.

Because of equality $M^{\prod_j n_j} = (M^{n_i})^{\prod_{j \neq i} n_j}$, the polynomial $M^{\prod_j n_j}_{qq}$ contains
the exponent $k_i \prod_{j \neq i} n_j$ for each $i \in J$. Now, suppose by contradiction that d(q) is not
finite. This means that all exponents in $M^{\prod_j n_j}_{qq}$ are equal to a unique integer h such
that $h = k_i \prod_{j \neq i} n_j$ for all $i \in J$. Hence, h must be a multiple of the least common
multiple of all products $\prod_{j \neq i} n_j$. Now we have
$\mathrm{LCM}\{\prod_{j \neq i} n_j \mid i \in J\} \cdot \mathrm{GCD}\{n_j \mid j \in J\} = \prod_j n_j$
and by the primitivity hypothesis $\mathrm{GCD}\{n_j \mid j \in J\} = 1$ holds. Therefore
h is a multiple of $\prod_j n_j$. Thus the conditions $k_j \leq n_j$ leave the only possibilities $k_j = 0$
for all $j \in J$ or $k_j = n_j$ for all $j \in J$. □

Observe that the previous proposition cannot be extended to the case when M is only
irreducible or when M is a matrix over S[x] that cannot be written as Ax + B for some
matrices A and B over S.

Example 2. The matrix M with entries $M_{11} = M_{22} = 0$, $M_{12} = x$ and $M_{21} = 1$ is
irreducible but it is not primitive since it has period 2. It is easy to see that the non-null
entries of all its powers are monomials, thus M has infinite x-period. □



Example 3. Consider again Figure 1 and set $M_{2,3} = x^3$. Then we obtain a primitive
matrix over $\mathbb{B}[x]$ that cannot be written as Ax + B and it does not have finite x-period. □



3.2 Properties of x-Periodic Matrices

Given a positive integer d, consider the cyclic group $C_d = \{1, g, g^2, \ldots, g^{d-1}\}$ of order
d and the semiring $B_d = (\mathcal{P}(C_d), +, \cdot)$ (which is also called $\mathbb{B}$-algebra of the cyclic
group) where $\mathcal{P}(C_d)$ denotes the family of all subsets of $C_d$ and for every pair of subsets
A, B of $C_d$ we set $A + B = A \cup B$ and $A \cdot B = \{a \cdot b \mid a \in A, b \in B\}$; hence $\emptyset$ is the unit
of the sum and {1} is the unit of the product. Now, given a positive semiring S, consider
the map $\varphi_d : S[x] \to B_d$ which associates any polynomial $F = \sum_k f_k x^k \in S[x]$
with the set $\{g^k \mid f_k \neq 0\} \in B_d$. Note that since the semiring S is positive $\varphi_d$ is a
semiring morphism. Intuitively, $\varphi_d$ associates F with the set of its exponents modulo the
integer d. Of course $\varphi_d$ extends to the semiring of $Q \times Q$-matrices over S[x] by setting
$\varphi_d(T)_{pq} = \varphi_d(T_{pq})$, for every matrix $T : Q \times Q \to S[x]$ and all $p, q \in Q$. Observe
that, since $\varphi_d$ is a morphism, $\varphi_d(T)^n_{pq} = \varphi_d(T^n)_{pq} = \varphi_d(T^n_{pq})$.

Now, let $M : Q \times Q \to S[x]$ be an irreducible matrix with finite x-period d.
Simply by the definition of d and $\varphi_d$, we have that for each $n \in \mathbb{N}$ all non-empty entries
$\varphi_d(M^n)_{pp}$ have cardinality 1. The following results also concern the powers of $\varphi_d(M)$.

Proposition 4. Let M be an irreducible matrix over S[x] with finite x-period d. Then, for
each integer n and each pair of indices p and q, the cardinality of the subset $\varphi_d(M)^n_{pq}$ of
$C_d$ is not greater than 1; moreover, if $\varphi_d(M)_{qq} \neq \emptyset$, then $\varphi_d(M)^n_{qq} = (\varphi_d(M)_{qq})^n$.
Proof. Let n be an arbitrary integer and p, q an arbitrary pair of indices. By the remarks
above we may assume $p \neq q$ and $M^n_{pq} \neq 0$. M being irreducible, there exists an integer
t such that $M^t_{qp} \neq 0$. Note that if B is a non-empty subset of $C_d$ then $|A \cdot B| \geq |A|$
holds for each $A \subseteq C_d$ and $\varphi_d(M)^{n+t}_{pp} \supseteq \varphi_d(M)^n_{pq} \cdot \varphi_d(M)^t_{qp}$. Therefore, since
$|\varphi_d(M)^{n+t}_{pp}| \leq 1$, we have also $|\varphi_d(M)^n_{pq}| \leq 1$. The second statement is proved in
a similar way reasoning by induction on n. □



Proposition 5. Let M be an irreducible matrix over S[x] with finite x-period d. Then,
for each integer n, all non-empty diagonal elements of $\varphi_d(M^n)$ are equal.

Proof. Let n be an arbitrary integer and let p, q be an arbitrary pair of indices such that
$M^n_{pp} \neq 0 \neq M^n_{qq}$. By the previous proposition, there exist h, k such that $\varphi_d(M)^n_{pp} = \{g^h\}$
and $\varphi_d(M)^n_{qq} = \{g^k\}$. If t is defined as in the previous proof then the two subsets
$\varphi_d(M)^t_{qp} \cdot \{g^h\}$ and $\{g^k\} \cdot \varphi_d(M)^t_{qp}$ are contained in $\varphi_d(M)^{n+t}_{qp}$; since this subset contains
only one element they must be equal and this completes the proof. □



Proposition 6. Let M be a primitive matrix over S[x] with finite x-period d. There
exists an integer $0 \leq \gamma < d$ such that for each integer n and each index q, if $M^n_{qq} \neq 0$,
then $\varphi_d(M^n_{qq}) = \{g^{\gamma n}\}$.

Proof. Since M is primitive, there exists an integer t such that $M^n_{pq} \neq 0$ for every
$n \geq t$ and for every pair of indices p and q. In particular, since dt + 1 > t, we
have $|\varphi_d(M^{dt+1}_{qq})| = 1$ for each q and hence there exists $0 \leq \gamma < d$ such that
$\varphi_d(M^{dt+1}_{qq}) = \{g^{\gamma}\}$. Observe that $\gamma$ does not depend on q, by Proposition 5. Therefore,
by Proposition 4, we have

$$\{g^{\gamma n}\} = \varphi_d(M^{(dt+1)n}_{qq}) \supseteq \varphi_d(M^{dtn}_{qq}) \cdot \varphi_d(M^{n}_{qq}) = \{1\} \cdot \varphi_d(M^{n}_{qq})$$

which proves the result. □

If M is the a-counting matrix of a linear representation, then the previous propositions
can be interpreted by considering its state diagram. For any pair of states p, q, all paths
of the same length starting in p and ending in q have the same number of occurrences
of a modulo d. Moreover, if $C_k$ is a $q_k$-cycle for k = 1, 2 and $C_1$ and $C_2$ have the same
length, then they also have the same number of occurrences of a modulo d. Finally, if
M is primitive, for each cycle $\ell$ we have $|\ell|_a = \gamma |\ell|$ modulo d for some integer $\gamma$.

We conclude this section with an example showing that Proposition 6 cannot be 
extended to the case when M is irreducible but not primitive. 

Example 4. Consider the a-counting matrix M associated with the state diagram of
Figure 2. Then M is irreducible with x-period 2, but it is not primitive since also its
period equals 2. Consider the path $\ell = q_1 \xrightarrow{b} q_2 \xrightarrow{a} q_1$. We have $|\ell| = 2$ and $|\ell|_a = 1$,
hence for any $\gamma$, $\gamma |\ell|$ cannot be equal to $|\ell|_a$ modulo 2. Thus, Proposition 6 does not
hold in this case.
$$M = \begin{pmatrix} 0 & 1 & x & 0 & 0 \\ x & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & x & 0 \\ 0 & 0 & 0 & 0 & x \\ x & 0 & 0 & 0 & 0 \end{pmatrix}$$

Fig. 2. State diagram and matrix of Example 4
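
The contrast between Proposition 6 and Example 4 can be explored numerically. The Python sketch below is an illustration only: it applies the map $\varphi_d$ by reducing exponents modulo d, multiplies the resulting matrices of residue sets, and lists the values $\gamma$ that satisfy $\varphi_d(M^n_{qq}) = \{g^{\gamma n}\}$ for every non-empty diagonal entry up to a bound on n. The names `phi`, `mul_mod`, `consistent_gammas` and the bound `max_n` are assumptions of the sketch; passing the test up to `max_n` is a finite check, not a proof.

```python
E = frozenset()                      # zero entry
ONE = frozenset({0})                 # entry 1 (a b-labelled edge)
X = frozenset({1})                   # entry x (an a-labelled edge)

FIG1 = [[E, ONE, E, X],
        [E, E, X, E],
        [ONE, E, E, E],
        [X, E, E, E]]
FIG2 = [[E, ONE, X, E, E],
        [X, E, E, E, E],
        [E, E, E, X, E],
        [E, E, E, E, X],
        [X, E, E, E, E]]


def phi(M, d):
    """Image of a matrix under phi_d: every exponent is replaced by its residue modulo d."""
    return [[frozenset(e % d for e in entry) for entry in row] for row in M]


def mul_mod(P, Q, d):
    """Product in the semiring of subsets of the cyclic group of order d."""
    m = len(P)
    return [[frozenset((i + j) % d for r in range(m) for i in P[p][r] for j in Q[r][q])
             for q in range(m)] for p in range(m)]


def consistent_gammas(M, d, max_n=24):
    """All gamma in {0, ..., d-1} with phi_d(M^n_qq) = {gamma * n mod d} whenever non-empty."""
    S = phi(M, d)
    candidates = set(range(d))
    P = S
    for n in range(1, max_n + 1):
        for q in range(len(M)):
            if P[q][q]:
                candidates = {g for g in candidates if P[q][q] == {(g * n) % d}}
        P = mul_mod(P, S, d)
    return sorted(candidates)


def max_entry_size(M, d, max_n=24):
    """Largest cardinality of an entry of phi_d(M)^n for 1 <= n <= max_n (cf. Proposition 4)."""
    S = phi(M, d)
    P, best = S, 0
    for _ in range(max_n):
        best = max(best, max(len(e) for row in P for e in row))
        P = mul_mod(P, S, d)
    return best
```

For the primitive matrix of Figure 1 with d = 4 the only surviving value is 3, while for the matrix of Figure 2 with d = 2 no value survives, because the cycle of length 2 through $q_1$ and $q_2$ carries exactly one a.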



3.3 Eigenvalues of x-Periodic Matrices

In this section we consider the semiring $\mathbb{R}_+$ of non-negative real numbers and we study
the eigenvalues of primitive matrices M(x) over $\mathbb{R}_+[x]$ when x assumes the complex
values z such that |z| = 1. The next theorem shows how the eigenvalues of M(z) are
related to the x-period of the matrix.

Theorem 2. Let M(x) be a primitive matrix over $\mathbb{R}_+[x]$ with finite x-period d, set
M = M(1) and let $\lambda$ be the Perron-Frobenius eigenvalue of M. Then, for all $z \in \mathbb{C}$
with |z| = 1, the following conditions are equivalent:

1. M(z) and M have the same set of moduli of eigenvalues;

2. If $\lambda(z)$ is an eigenvalue of maximum modulus of M(z), then $|\lambda(z)| = \lambda$;

3. z is a d-th root of unity in $\mathbb{C}$.

Proof (outline). Clearly condition 1) implies condition 2). To prove that condition 2)
implies condition 3) we reason by contradiction, that is we assume that z is not a d-th
root of unity. It is possible to prove that in this case there exists an integer n such that
$|M(z)^n| < M^n$. Therefore we can apply Proposition 1 and prove that $\lambda^n$ is greater than
the modulus of any eigenvalue of $M(z)^n$. In particular we have $\lambda^n > |\lambda(z)|^n$ which
contradicts the hypothesis.

Finally we show that condition 3) implies condition 1). The case d = 1 is trivial;
thus suppose d > 1 and assume that z is a d-th root of unity. It suffices to prove that if
$\nu$ is an eigenvalue of M, then $\nu z^{\gamma}$ is an eigenvalue of M(z) with the same multiplicity,
where $\gamma$ is the constant introduced in Proposition 6. To this end, set $\hat{T} = I \nu z^{\gamma} - M(z)$
and $T = I\nu - M$. We now verify that $\mathrm{Det}\, \hat{T} = z^{\gamma m}\, \mathrm{Det}\, T$ holds, where m is the
size of M. To prove this equality, recall that
$\mathrm{Det}\, \hat{T} = \sum_{\rho} \mathrm{sgn}(\rho)\, \hat{T}_{1\rho(1)} \cdots \hat{T}_{m\rho(m)}$,
where the sum ranges over all permutations $\rho$ of $\{1, \ldots, m\}$.

By Proposition 6, since z is a d-th root of 1 in $\mathbb{C}$, we have $\hat{T}_{qq} = (\nu - M_{qq}) z^{\gamma} = z^{\gamma} T_{qq}$
for each state q and $\hat{T}_{q_0 q_1} \cdots \hat{T}_{q_{s-1} q_0} = z^{\gamma s}\, T_{q_0 q_1} \cdots T_{q_{s-1} q_0}$ for each simple
cycle $(q_0, q_1, \ldots, q_{s-1}, q_0)$ of length s > 1. Therefore, for each permutation $\rho$, we get
$\hat{T}_{1\rho(1)} \cdots \hat{T}_{m\rho(m)} = z^{\gamma m} \cdot T_{1\rho(1)} \cdots T_{m\rho(m)}$ which concludes the proof. □

Example 5. Let us consider again the primitive matrix of Figure 1. We recall that here
d = 4; moreover it is easy to see that $\gamma = 3$. Indeed, for each k = 1, 2, we have that
$|\ell_k| - 3|\ell_k|_a$ is equal to 0 modulo 4. Now consider the characteristic polynomial of the
matrix M(x), given by $\chi_x(y) = y^4 - y^2 x^2 - yx$ and let $\nu$ be a root of $\chi_1$. This implies
that $\chi_1(\nu) = \nu^4 - \nu^2 - \nu = 0$ and hence $-i\nu$ is a root of the polynomial $\chi_i$, $-\nu$ is a
root of the polynomial $\chi_{-1}$ and $i\nu$ is a root of the polynomial $\chi_{-i}$. This is consistent
with Theorem 2, since $1$, $i$, $-1$ and $-i$ are the fourth roots of unity. □
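
The root relations of Example 5 can be verified with a few lines of complex arithmetic. The following Python sketch is an illustration under explicit assumptions: it takes the characteristic polynomial to be $\chi_x(y) = y^4 - y^2x^2 - yx$ and $\gamma = 3$, approximates the Perron-Frobenius eigenvalue as the real root of $y^3 - y - 1$ by Newton's method (using the factorisation $\chi_1(y) = y\,(y^3 - y - 1)$), and checks that $\chi_z(\nu z^{\gamma})$ vanishes for the fourth roots of unity z. The function names are hypothetical.

```python
def chi(x, y):
    """Characteristic polynomial of M(x) for the matrix of Figure 1, evaluated at y."""
    return y ** 4 - (y ** 2) * (x ** 2) - y * x


def perron_root(iterations=60):
    """Real root of y^3 - y - 1, i.e. the largest root of chi(1, y), by Newton's method."""
    v = 1.5
    for _ in range(iterations):
        v -= (v ** 3 - v - 1) / (3 * v ** 2 - 1)
    return v


GAMMA = 3
FOURTH_ROOTS = [1, 1j, -1, -1j]
LAMBDA = perron_root()

# For every fourth root of unity z, nu * z^gamma should be a root of chi_z when nu is a root of chi_1.
residuals = [abs(chi(z, LAMBDA * z ** GAMMA)) for z in FOURTH_ROOTS]
```

Since $z^4 = 1$, the identity $\chi_z(y z^3) = \chi_1(y)$ in fact holds for every complex y, which is the determinant identity of the proof of Theorem 2 specialised to this matrix (m = 4, $z^{\gamma m} = 1$).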



4 Local Limit Properties for Pattern Statistics 

In this section we turn again our attention to pattern statistics and study the symbol 
frequency problem in the rational stochastic model under primitivity hypothesis; our 
goal is to determine the local limit distribution of the corresponding random variable. 

Formally, given a rational formal series $r : \{a, b\}^* \to \mathbb{R}_+$, let $(\xi, \mu, \eta)$ be a linear
representation of r of size m. Set $A = \mu(a)$, $B = \mu(b)$ and, to avoid trivial cases,
assume $A \neq 0 \neq B$. We also set M(x) = Ax + B and M = M(1). Then, consider
the probability space of all words of length n in $\{a, b\}^*$ equipped with the probability
function given by

$$P\{w\} = \frac{\xi^T \mu(w)\, \eta}{\xi^T M^n \eta}$$

for every $w \in \{a, b\}^n$. Now, consider the random variable $Y_n : \{a, b\}^n \to \{0, 1, \ldots, n\}$
such that $Y_n(w) = |w|_a$ for every $w \in \{a, b\}^n$. For sake of brevity, we say that $\{Y_n\}_n$
counts the occurrences of a in the model defined by r. We study the asymptotic behaviour
of $Y_n$ under the hypothesis that the matrix M(x) is primitive, obtaining a local limit
distribution that strongly depends on the x-period of M(x).

In the following analysis we consider triples $(\xi, \mu, \eta)$ where both $\xi$ and $\eta$ have just
one non-null entry which is equal to 1: indeed, it turns out that the general case can be
reduced to this kind of representation. To show this fact first observe that, for n large
enough, all entries of the matrix $M^n$ are strictly positive and thus for every integer
$0 \leq k \leq n$ we have

$$P\{Y_n = k\} = \frac{\sum_{|w|=n,\, |w|_a=k} \xi^T \mu(w)\, \eta}{\xi^T M^n \eta} = \sum_{p,q=1}^{m} \frac{\xi_p M^n_{pq} \eta_q}{\xi^T M^n \eta} \cdot \frac{\sum_{|w|=n,\, |w|_a=k} \mu(w)_{pq}}{M^n_{pq}} . \qquad (2)$$


Since M(x) is primitive, by the Perron-Frobenius Theorem, M admits exactly one
eigenvalue $\lambda$ of maximum modulus, which is real and positive. Furthermore, we can
associate to $\lambda$ strictly positive left and right eigenvectors u and v such that $v^T u = 1$ and
we know that, as n tends to infinity, we have $M^n \sim \lambda^n \cdot u v^T$. Hence

$$\frac{\xi_p M^n_{pq} \eta_q}{\xi^T M^n \eta} \sim \frac{\xi_p u_p v_q \eta_q}{\xi^T u\, v^T \eta} .$$

Now, let $Y_n^{pq}$ be the random variable associated with the linear representation $(e_p, \mu, e_q)$,
where $e_i$ denotes the characteristic vector of entry i. Thus, equation (2) can be reformulated as

$$P\{Y_n = k\} = \sum_{p,q=1}^{m} C_{pq} \cdot P\{Y_n^{pq} = k\} \qquad (3)$$

where $C_{pq}$ are non-negative constants such that $\sum_{p,q} C_{pq} = 1$.
Theorem 3. Let $r : \{a, b\}^* \to \mathbb{R}_+$ be a rational formal series with a linear representation
of the form $(e_p, \mu, e_q)$ such that $\mu(a) \neq 0 \neq \mu(b)$. Assume that its a-counting matrix
M(x) is primitive and let d be its x-period. Also let $\{Y_n^{pq}\}_n$ count the occurrences of
a in the model defined by r. Then, there exist two real constants $0 < \alpha, \beta < 1$ and an
integer $0 \leq \rho \leq d - 1$, all of them depending on M = M(1), such that as n tends to
$+\infty$ the relation

$$P\{Y_n^{pq} = k\} = \begin{cases} \dfrac{d}{\sqrt{2\pi\alpha n}}\; e^{-\frac{(k-\beta n)^2}{2\alpha n}} + o\!\left(\dfrac{1}{\sqrt{n}}\right) & \text{if } k \equiv \rho \bmod d \\[2ex] 0 & \text{otherwise} \end{cases}$$

holds uniformly for all integers $0 \leq k \leq n$.
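
The periodic support described by the theorem can be observed on exact values. The Python sketch below is an illustration only, with assumed names and an assumed instance: it takes the automaton of Figure 1 with unit weights over $\mathbb{R}_+$, a single initial state p and a single final state q, and computes $P\{Y_n^{pq} = k\}$ exactly by propagating, step by step, the total weight of the paths that start in p, end in each state and contain k occurrences of a. It does not compute the constants $\alpha$ and $\beta$.

```python
from fractions import Fraction

# Assumed instance: Figure 1 with unit weights; A = a-labelled edges, B = b-labelled edges.
A = [[0, 0, 0, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 0]]
B = [[0, 1, 0, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 0]]


def distribution(p, q, n):
    """Exact probabilities P{Y_n^{pq} = k} for k = 0..n (requires M^n_pq != 0)."""
    m = len(A)
    # w[s][k] = total weight of the paths of the current length from p to s with k letters a
    w = [[0] * (n + 1) for _ in range(m)]
    w[p][0] = 1
    for _ in range(n):
        nxt = [[0] * (n + 1) for _ in range(m)]
        for s in range(m):
            for k in range(n + 1):
                c = w[s][k]
                if c == 0:
                    continue
                for t in range(m):
                    if B[s][t]:
                        nxt[t][k] += c * B[s][t]        # a b-step keeps the count of a
                    if A[s][t]:
                        nxt[t][k + 1] += c * A[s][t]    # an a-step increases it by one
        w = nxt
    total = sum(w[q])                                   # this is M^n_pq
    return [Fraction(c, total) for c in w[q]]


def support(dist):
    """Values k that carry positive probability."""
    return {k for k, pr in enumerate(dist) if pr != 0}
```

For p = q = $q_1$ and n = 12 there are twelve paths, and the support is {4, 8, 12}, a single residue class modulo d = 4, with $P\{Y_{12} = 8\} = 10/12$.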



Proof. For the sake of simplicity, for any integer $0 \leq k \leq n$ set

$$p_n(k) = P\{Y_n^{pq} = k\} = \frac{\sum_{|w|=n,\, |w|_a=k} \mu(w)_{pq}}{M^n_{pq}} .$$

Observe that by Proposition 4 there exists an integer $0 \leq \rho < d$ such that the number of
occurrences of a in paths of the same length starting in p and ending in q are equal to $\rho$
modulo d. Hence, $p_n(k) = 0$ for each $k \not\equiv \rho \bmod d$. Now, consider the smallest integer
N such that $Nd \geq n + 1$ and apply the N-Discrete Fourier Transform to the array

$$(p_n(\rho),\, p_n(\rho + d),\, \ldots,\, p_n(\rho + (N-1)d)) \in \mathbb{R}_+^N$$

where the last coefficient is null if $n < \rho + (N-1)d$. We get the following values $f_n(s)$,
for integers s such that $-\lceil N/2 \rceil < s \leq \lfloor N/2 \rfloor$:

$$f_n(s) = \sum_{j=0}^{N-1} p_n(\rho + jd)\, e^{i \frac{2\pi s}{N} j} .$$

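Observe that these coefficients are related to the characteristic function $F_n(\theta)$ of the
random variable $Y_n^{pq}$, i.e.

$$F_n(\theta) = \sum_{k} p_n(k)\, e^{i\theta k} = \frac{\sum_{|w|=n} \mu(w)_{pq}\, e^{i\theta |w|_a}}{M^n_{pq}} . \qquad (4)$$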



Indeed, for any $-\lceil N/2 \rceil < s \leq \lfloor N/2 \rfloor$, we have

$$f_n(s) = e^{-i \frac{2\pi s}{Nd} \rho} \cdot F_n\!\left(\frac{2\pi s}{Nd}\right) .$$


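Hence to obtain the values $p_n(\rho + jd)$, $j \in \{0, 1, \ldots, N-1\}$, it is sufficient to compute
the N-th Inverse Transform of the $f_n(s)$'s. So, we have

$$p_n(\rho + jd) = \frac{1}{N} \sum_{s=-\lceil N/2 \rceil + 1}^{\lfloor N/2 \rfloor} f_n(s)\, e^{-i \frac{2\pi s}{N} j} = \frac{1}{N} \sum_{s=-\lceil N/2 \rceil + 1}^{\lfloor N/2 \rfloor} F_n\!\left(\frac{2\pi s}{Nd}\right) e^{-i \frac{2\pi s}{Nd} (\rho + jd)} . \qquad (5)$$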
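
The transform and its inverse can be exercised on a tiny example. The Python sketch below is only an illustration: the sign convention of the exponentials is an assumption (chosen so that the characteristic function is $\sum_k p_n(k) e^{i\theta k}$), and the probabilities are sample values for the automaton of Figure 1 with p = q = $q_1$, n = 12, d = 4 and $\rho = 0$, namely 1/12, 10/12 and 1/12 at k = 4, 8 and 12. All names are hypothetical.

```python
import cmath

D, N_LEN, RHO = 4, 12, 0                       # x-period d, word length n, residue rho
P = {4: 1 / 12, 8: 10 / 12, 12: 1 / 12}        # assumed sample values of p_n(k)

N = -(-(N_LEN + 1) // D)                       # smallest N with N * d >= n + 1
ARRAY = [P.get(RHO + j * D, 0.0) for j in range(N)]
S_RANGE = range(-((N + 1) // 2) + 1, N // 2 + 1)   # integers s with -ceil(N/2) < s <= floor(N/2)


def char_fun(theta):
    """Characteristic function F_n(theta) = sum_k p_n(k) e^{i theta k}."""
    return sum(pr * cmath.exp(1j * theta * k) for k, pr in P.items())


def transform(s):
    """f_n(s) = sum_j p_n(rho + j d) e^{i 2 pi s j / N}."""
    return sum(ARRAY[j] * cmath.exp(2j * cmath.pi * s * j / N) for j in range(N))


def inverse(j):
    """p_n(rho + j d) recovered as (1/N) sum_s f_n(s) e^{-i 2 pi s j / N}."""
    return sum(transform(s) * cmath.exp(-2j * cmath.pi * s * j / N) for s in S_RANGE) / N


RECOVERED = [inverse(j).real for j in range(N)]
```

Here N = 4, the range of s is {-1, 0, 1, 2}, and the inverse transform returns the original array (0, 1/12, 10/12, 1/12) up to rounding; each $f_n(s)$ also agrees with $e^{-i 2\pi s \rho/(Nd)} F_n(2\pi s/(Nd))$.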
To evaluate $p_n(k)$ we now need asymptotic expressions of the function $F_n(\theta)$ in the
interval $\theta \in (-\pi/d, \pi/d]$. To this aim observe that relations (4) and (1) imply

$$F_n(\theta) = \frac{M(e^{i\theta})^n_{pq}}{M^n_{pq}} .$$

Now, by Theorem 2, we know that the eigenvalues of $M(e^{i\theta})$ are in modulus smaller
than $\lambda$, for each $\theta \in (-\pi/d, \pi/d]$ different from 0. This property allows us to argue as in [3,
Theorem 5]. As a consequence, for each $\theta \in (-\pi/d, \pi/d]$ we can approximate the function
$F_n(\theta)$ with the function $\hat{F}_n(\theta) = \exp(-\frac{\alpha}{2} n \theta^2 + i\beta n \theta)$, where $\alpha$ and $\beta$ are positive
constants depending on M. Thus, we get

$$\left| F_n(\theta) - \exp\left(-\tfrac{\alpha}{2} n \theta^2 + i \beta n \theta\right) \right| = \Delta_n(\theta)$$

where, as n tends to $+\infty$,

fo(^) if|e|e[o,^] 

A„(9) = o if |0| G [^M 

lo(r") if| 0 |e[ 0 o,i] 

for some O<0o< 3 ’ 0 <r<l, and for every | 

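Therefore, to find the approximation for $p_n(k)$ it is sufficient to replace into (5) the
values $F_n\!\left(\frac{2\pi s}{Nd}\right)$ by their approximations $\hat{F}_n\!\left(\frac{2\pi s}{Nd}\right)$, so getting the following values for
each $k \equiv \rho \bmod d$:

$$\hat{p}_n(k) = \frac{1}{N} \sum_{s=-\lceil N/2 \rceil + 1}^{\lfloor N/2 \rfloor} \hat{F}_n\!\left(\frac{2\pi s}{Nd}\right) e^{-i \frac{2\pi s}{Nd} k} . \qquad (6)$$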
Indeed, one can verify that $|p_n(k) - \hat{p}_n(k)| = O(\ldots)$ for any $1/3 < \varepsilon < 1/2$ and
every $k \equiv \rho \bmod d$. Finally, the sum in (6) can be computed by using the definition of
Riemann integral and by means of standard mathematical tools: we find the following
approximation which holds as n tends to $+\infty$, uniformly for all $k \equiv \rho \bmod d$:

$$\hat{p}_n(k) \approx \frac{d}{2\pi} \int_{-\infty}^{+\infty} \hat{F}_n(t)\, e^{-ikt}\, dt = \frac{d}{2\pi} \int_{-\infty}^{+\infty} e^{-\frac{\alpha}{2} n t^2 + i(\beta n - k)t}\, dt = \frac{d}{\sqrt{2\pi\alpha n}}\; e^{-\frac{(k-\beta n)^2}{2\alpha n}} .$$

□



Applying the theorem above to equation (3), we obtain the following result. 

Theorem 4. Let $r : \{a, b\}^* \to \mathbb{R}_+$ be a rational formal series with a linear representation
$(\xi, \mu, \eta)$ such that $\mu(a) \neq 0 \neq \mu(b)$. Assume that its a-counting matrix M(x) is
primitive and let d be its x-period. Also let $\{Y_n\}_n$ count the occurrences of a in the model
defined by r. Then, there exist two constants $0 < \alpha, \beta < 1$ depending on M = M(1)
and d constants $C_0, C_1, \ldots, C_{d-1}$ depending on M, $\xi$ and $\eta$ such that $C_i \geq 0$ for every
$i = 0, \ldots, d-1$, $\sum_i C_i = 1$, and as n tends to $+\infty$ the relation

$$P\{Y_n = k\} = \frac{d\, C_{\langle k \rangle_d}}{\sqrt{2\pi\alpha n}}\; e^{-\frac{(k-\beta n)^2}{2\alpha n}} + o\!\left(\frac{1}{\sqrt{n}}\right)$$

holds uniformly for all integers $0 \leq k \leq n$ (here $\langle k \rangle_d = k - d\lfloor k/d \rfloor$ is the residue of k modulo d).
Summarizing the previous results, Theorem 3 states that if the weighted automaton
associated with the linear representation has just one initial and one final state, then the
local limit distribution has a sort of periodic behaviour: it reduces to 0 everywhere in the
domain of possible values except for an integer linear progression of period d, where it
approaches a normal density function expanded by a factor d. We also observe that the
main terms of mean value and variance of $Y_n^{pq}$ are given by $\beta n$ and $\alpha n$, respectively,
which do not depend on the initial and final states.

In the general case, when the automaton has more than one initial or final state, by
Theorem 4 the required limit distribution is given by a superposition of behaviours of
the previous type, all of which have the same main terms of mean value and variance.
In the case d = 1 the limit probability function of $Y_n$ reduces exactly to a Gaussian
density function as already proved in [3]. Such a limit density is the same obtained in
the classical DeMoivre-Laplace Local Limit Theorem (see for instance [7, Sec. 12]).
Abstract. This paper examines the learnability of a major subclass
of E-pattern languages - also known as erasing or extended pattern lan- 
guages - in Gold’s learning model: We show that the class of terminal-free 
E-pattern languages is inferrable from positive data if the corresponding 
terminal alphabet consists of three or more letters. Consequently, the 
recently presented negative result for binary alphabets is unique. 



1 Introduction 

Pattern languages have been introduced by Angluin (cf. [1]), though some publications
by Thue can already be interpreted as investigations of pattern structures
(cf. e.g. [18]). Following Angluin, a pattern is a finite string that consists
of variables and of terminal symbols. A word of its language is generated by
a uniform substitution of all variables with arbitrary strings of terminals. For
instance, the language generated by the pattern $\alpha = x_1\, a\, x_2\, b\, x_1$ (with $x_1, x_2$ as variables
and a, b as terminals) contains, among others, the words $w_1$ = a a a b a,
$w_2$ = a a b b b a, $w_3$ = a b b, whereas the following examples are not covered by
$\alpha$: $v_1$ = a, $v_2$ = b a b b a, $v_3$ = b b b b b. Thus, numerous regular and nonregular
languages can be described by patterns in a compact and “natural” way.
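
Membership of a word in a pattern language can be decided by a simple backtracking search over the possible substitutions. The Python sketch below is a minimal illustration and not an algorithm from the paper: it assumes that the example pattern reads $x_1\, a\, x_2\, b\, x_1$, encodes a pattern as a list of tokens in which names such as "x1" are variables and single letters are terminals, and uses a hypothetical function `in_pattern_language` whose flag `erasing` switches between allowing and forbidding the empty substitution.

```python
def is_variable(token):
    """Convention of this sketch: variables are written x1, x2, ...; all other tokens are terminals."""
    return len(token) > 1 and token[0] == "x" and token[1:].isdigit()


def in_pattern_language(pattern, word, erasing=True):
    """True if some uniform substitution of the variables turns the pattern into the word."""
    def go(i, pos, binding):
        if i == len(pattern):
            return pos == len(word)
        token = pattern[i]
        if not is_variable(token):
            # terminals must be matched literally
            return word.startswith(token, pos) and go(i + 1, pos + len(token), binding)
        if token in binding:
            # a variable that is already bound must repeat the same string
            value = binding[token]
            return word.startswith(value, pos) and go(i + 1, pos + len(value), binding)
        first_end = pos if erasing else pos + 1   # the empty string is allowed only when erasing
        for end in range(first_end, len(word) + 1):
            binding[token] = word[pos:end]
            if go(i + 1, end, binding):
                return True
        binding.pop(token, None)                  # undo the tentative binding before backtracking
        return False

    return go(0, 0, {})


ALPHA = ["x1", "a", "x2", "b", "x1"]
```

With this encoding the words aaaba, aabbba and abb are accepted and the words a, babba and bbbbb are rejected; the word abb is rejected as soon as the empty substitution is forbidden, since it requires $x_1$ to be empty. The search is exponential in the worst case, which is acceptable for short examples only.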

Pattern languages have been the subject of several analyses within the scope 
of formal language theory, e.g. in [5] and [6] (for a survey see [16]). These exami- 
nations reveal that a definition disallowing the substitution of variables with the 
empty word - as given by Angluin - leads to a language with particular features 
being quite different from the one allowing the empty substitution (that has 
been applied when generating $w_3$ in our example). Languages of the latter type
have been introduced by Shinohara (cf. [17]); they are referred to as extended, 
erasing, or E-pattern languages. 

When dealing with pattern languages, manifold questions arise from the man- 
ifest problem of computing a pattern that is common to a given set of words. 
Therefore pattern languages have been a focus of interest of algorithmic learning 
theory from the very beginning. In the most elementary learning model of induc- 
tive inference - introduced by Gold (cf. [4]) and known as learning in the limit 
or Gold style learning - a class of languages is said to be inferrable from positive 
data if and only if there exists a computable device (the so-called learning strategy)
that, for each of these (potentially infinite) languages and for every full
enumeration of the words of the particular language to be learned, converges to
a distinct and complete description of the language in finite time. According to
[4], this task is too challenging for many well-known classes of formal languages:
All superfinite classes of languages - i.e. all classes that contain every finite and 
at least one infinite language - such as the regular, context-free and context- 
sensitive languages are not inferrable from positive data. Thus, the number of 
rich classes of languages that are known to be learnable is rather small. 

The current state of knowledge concerning the learnability of pattern lan- 
guages considerably differs when regarding standard or E-pattern languages, re- 
spectively: The learnability of the class of standard pattern languages was shown 
by Angluin when introducing its definition in 1979 (cf. [1]). In the sequel there 
has been a variety of profound studies (e.g. in [7], [19], [14], and many more) 
on the complexity of learning algorithms, consequences of different input data, 
efficient strategies for subclasses, and so on. Regarding E-pattern languages, 
however, appropriate approaches presumably need to be more sophisticated and 
therefore progress has been rather scarce. Apart from positive results for the full 
class of E-pattern languages over the trivial unary and infinite terminal alpha- 
bets in [11], the examinations in the past two decades restricted themselves to 
the learnability of subclasses (cf. [17], [11], [12], and - indirectly - [20]). In spite 
of all effort, it took more than twenty years until at least for binary terminal al- 
phabets the non-learnability of the subclass of terminal-free E-pattern languages 
(generated by patterns that only consist of variables) and, thus, of the full class 
could be proven (cf. [13]). 

In this paper we revert to the class of terminal-free E-pattern languages - 
that has been a subject of some language theoretical examinations (cf. [3] and 
[6]) as well - with a rather surprising outcome: We show that it is inferrable 
from positive data if and only if the terminal alphabet does not consist of two 
letters. Thus, we present the first class of pattern languages to be known for 
which different non-trivial alphabets imply different answers to the question of 
learnability. Using several theorems in [2] and [6], our respective reasoning is 
chiefly combinatorial; therefore it touches on some prominent topics within the 
research on word monoids and combinatorics of words. 



2 Definitions and Preliminary Results 

For standard mathematical notions and recursion-theoretic terms not defined 
explicitly in this paper we refer to [15]; for unexplained aspects of formal language 
theory, [16] may be consulted. 

For an arbitrary set $A$ of symbols, $A^+$ denotes the set of all non-empty words
over $A$ and $A^*$ the set of all (empty and non-empty) words over $A$. We designate
the empty word as $e$. For the word that results from the $n$-fold concatenation
of a letter $a$ we write $a^n$. $|\cdot|$ denotes the size of a set or the length of a
word, respectively, and $|w|_a$ the frequency of a letter $a$ in a word $w$.

$\Sigma$ is an alphabet of terminal symbols and $X = \{x_1, x_2, x_3, \ldots\}$ an infinite
set of variable symbols, $\Sigma \cap X = \emptyset$. We designate $\Sigma$ as trivial if and only if
$|\Sigma| = 1$ or $|\Sigma| = \infty$. Henceforth, we use lower case letters from the beginning of
the Latin alphabet as terminal symbols; words of terminal symbols are named
as $u$, $v$, or $w$. For every $j$, $j \ge 1$, $y_j \in X$ is an unspecified variable, i.e. there may
exist $j, j' \in \mathbb{N}$ such that $j \ne j'$ and $y_j = y_{j'}$. A pattern is a word over $\Sigma \cup X$, a
terminal-free pattern is a word over $X$; naming patterns we use lower case letters
from the beginning of the Greek alphabet. $\mathrm{var}(\alpha)$ denotes the set of all variables
of a pattern $\alpha$. We write $\mathrm{Pat}$ for the set of all patterns and $\mathrm{Pat}_{\mathrm{tf}}$ for the set of
all terminal-free patterns.

A substitution is a morphism $\sigma : (\Sigma \cup X)^* \to \Sigma^*$ such that $\sigma(a) = a$ for
every $a \in \Sigma$. The E-pattern language $L_\Sigma(\alpha)$ of a pattern $\alpha$ is defined as the set of
all $w \in \Sigma^*$ such that $\sigma(\alpha) = w$ for some substitution $\sigma$. For any word $w = \sigma(\alpha)$
we say that $\sigma$ generates $w$, and for any language $L = L_\Sigma(\alpha)$ we say that $\alpha$
generates $L$. If there is no need to give emphasis to the concrete shape of $\Sigma$ we
denote the E-pattern language of a pattern $\alpha$ simply as $L(\alpha)$. We use $\mathrm{ePAT}_{\mathrm{tf}}$
as an abbreviation for the full class of terminal-free E-pattern languages. For
any class $\mathrm{ePAT}_\star$ of E-pattern languages we write $\mathrm{ePAT}_{\star,\Sigma}$ if the corresponding
alphabet is of interest.

According to [10] we call a word $w$ ambiguous (in respect of a pattern $\alpha$) if and
only if there exist at least two substitutions $\sigma$ and $\sigma'$ such that $\sigma(\alpha) = w = \sigma'(\alpha)$,
but $\sigma(x_j) \ne \sigma'(x_j)$ for an $x_j \in \mathrm{var}(\alpha)$. The word $w = aa$, for instance, is
ambiguous in respect of the pattern $\alpha = x_1\, a\, x_2$ since it can be generated by
the substitutions $\sigma$, $\sigma(x_1) = a$, $\sigma(x_2) = e$, and $\sigma'$, $\sigma'(x_1) = e$, $\sigma'(x_2) = a$.
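
The notion of ambiguity can be checked mechanically on small instances. The following Python sketch is only an illustration under our own assumptions: patterns are encoded as lists of strings, terminals are the single letters in `terminals`, and the names `substitutions` and `is_ambiguous` are hypothetical helpers, not notation from this paper. It enumerates by backtracking all substitutions that generate a given word.

```python
from typing import Dict, FrozenSet, Iterator, List


def substitutions(pattern: List[str], word: str,
                  terminals: FrozenSet[str] = frozenset("ab")) -> Iterator[Dict[str, str]]:
    """Yield every substitution sigma (variable -> word) with sigma(pattern) == word."""
    sigma: Dict[str, str] = {}

    def go(i: int, pos: int) -> Iterator[Dict[str, str]]:
        if i == len(pattern):
            if pos == len(word):
                yield dict(sigma)
            return
        sym = pattern[i]
        if sym in terminals:              # terminals are mapped to themselves
            if word.startswith(sym, pos):
                yield from go(i + 1, pos + 1)
        elif sym in sigma:                # bound variable: its image must repeat
            image = sigma[sym]
            if word.startswith(image, pos):
                yield from go(i + 1, pos + len(image))
        else:                             # unbound variable: try every image, incl. the empty word
            for end in range(pos, len(word) + 1):
                sigma[sym] = word[pos:end]
                yield from go(i + 1, end)
            del sigma[sym]

    yield from go(0, 0)


def is_ambiguous(pattern: List[str], word: str) -> bool:
    """True iff at least two different substitutions generate `word`."""
    found = substitutions(pattern, word)
    return next(found, None) is not None and next(found, None) is not None
```

For the pattern `["x1", "a", "x2"]` and the word `"aa"` the enumeration yields exactly the two substitutions named above. The search is exponential in the worst case and is meant for toy inputs only.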

Following [11], we designate a pattern $\alpha$ as succinct if and only if $|\alpha| \le |\beta|$
for all patterns $\beta$ with $L(\beta) = L(\alpha)$, and we call a pattern prolix if and only if
it is not succinct. The pattern $\alpha = x_1 x_1$, for instance, is succinct because there
does not exist any shorter pattern that exactly describes its language, whereas
$\beta = x_1 x_2 x_1 x_2$ is prolix since $L(\beta) = L(\alpha)$ and $|\alpha| < |\beta|$.

Let $\mathrm{ePAT}_\star$ be any set of E-pattern languages. We say that the inclusion
problem for $\mathrm{ePAT}_\star$ is decidable if and only if there exists a computable function
which, given two arbitrary patterns $\alpha, \beta$ with $L(\alpha), L(\beta) \in \mathrm{ePAT}_\star$, decides
whether or not $L(\alpha) \subseteq L(\beta)$. In [6] it is shown that the inclusion problem for
the full class of E-pattern languages is not decidable. Fortunately, this fact does
not hold for terminal-free E-pattern languages. As this is of great importance
for the following studies, we now cite two respective theorems of [6]:

Fact 1. Let $\Sigma$ be an alphabet, $|\Sigma| \ge 2$, and $\alpha, \beta \in X^*$ two arbitrarily given
terminal-free patterns. Then $L_\Sigma(\beta) \subseteq L_\Sigma(\alpha)$ iff there exists a morphism
$\phi : X^* \to X^*$ such that $\phi(\alpha) = \beta$.

Fact 2. The inclusion problem for ePATtf is decidable. 
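
Fact 1 turns the inclusion test into a finite search for a morphism, which is one way to see why Fact 2 holds. The sketch below is a naive backtracking search and not the procedure of [6]; patterns are lists of variable names, and `find_morphism` and `is_included` are hypothetical names chosen for this illustration. It assumes an alphabet with at least two letters, as Fact 1 requires.

```python
from typing import Dict, List, Optional


def find_morphism(alpha: List[str], beta: List[str]) -> Optional[Dict[str, List[str]]]:
    """Return a morphism phi (variable -> word over X) with phi(alpha) == beta, or None."""
    phi: Dict[str, List[str]] = {}

    def go(i: int, pos: int) -> bool:
        if i == len(alpha):
            return pos == len(beta)
        x = alpha[i]
        if x in phi:                       # image already fixed by an earlier occurrence
            img = phi[x]
            return beta[pos:pos + len(img)] == img and go(i + 1, pos + len(img))
        for end in range(pos, len(beta) + 1):
            phi[x] = beta[pos:end]         # the empty image is allowed (E-pattern languages)
            if go(i + 1, end):
                return True
        del phi[x]
        return False

    return dict(phi) if go(0, 0) else None


def is_included(beta: List[str], alpha: List[str]) -> bool:
    """Decide L(beta) <= L(alpha) for terminal-free patterns via Fact 1."""
    return find_morphism(alpha, beta) is not None
```

With `A = ["x1", "x1"]` and `B = ["x1", "x2", "x1", "x2"]`, both `is_included(A, B)` and `is_included(B, A)` hold, so the two patterns generate the same language.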

We now introduce our notions on Gold's learning model (cf. [4]): Each function
$t : \mathbb{N} \to \Sigma^*$ satisfying $\{t(n) \mid n \ge 0\} = L(\alpha)$ is called a text for $L(\alpha)$. Let $S$ be
any total computable function reading initial segments of texts and returning
patterns. Each such function is called a strategy. If $\alpha$ is a pattern and $t$ a text
for $L(\alpha)$ we say that $S$ identifies $L(\alpha)$ from $t$ if and only if the sequence of
patterns returned by $S$, when reading $t$, converges to a pattern $\beta$ such that
$L(\beta) = L(\alpha)$. Any class $\mathrm{ePAT}_\star$ of E-pattern languages is learnable (in the limit)
if and only if there is a strategy $S$ identifying each language $L \in \mathrm{ePAT}_\star$ from
any corresponding text. In this case we write $\mathrm{ePAT}_\star \in \text{LIM-TEXT}$ for short.

The analysis of the learnability of certain classes of languages is facilitated 
by some profound criteria given by Angluin (cf. [2]). Because of Fact 2 and since 
Pattf is recursively enumerable, we can use the following: 

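Fact 3. Let $\mathrm{Pat}_\star$ be an arbitrary, recursively enumerable set of patterns and
$\mathrm{ePAT}_\star$ the corresponding class of E-pattern languages, such that the inclusion
problem for $\mathrm{ePAT}_\star$ is decidable. Then $\mathrm{ePAT}_\star \in \text{LIM-TEXT}$ iff for every pattern
$\alpha \in \mathrm{Pat}_\star$ there exists a set $T_\alpha$ such that

- $T_\alpha \subseteq L(\alpha)$,

- $T_\alpha$ is finite, and

- there does not exist a pattern $\beta \in \mathrm{Pat}_\star$ with $T_\alpha \subseteq L(\beta) \subset L(\alpha)$.

If $T_\alpha$ exists, then it is called a telltale (for $L(\alpha)$) (in respect of $\mathrm{ePAT}_\star$).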

Roughly speaking, $\mathrm{ePAT}_\star$ is, thus, inferrable from positive data if and only
if each of its languages contains a finite subset that may be interpreted (by a
strategy) as an exclusive signal to distinguish between that specific language
and all of its sub-languages in $\mathrm{ePAT}_\star$.

We conclude this section with the seminal learnability result on ePATtf that 
has been presented in [13]: 

Fact 4. Let $\Sigma$ be an alphabet, $|\Sigma| = 2$. Then $\mathrm{ePAT}_{\mathrm{tf},\Sigma} \notin \text{LIM-TEXT}$.

In [13] it is stated that the proof of this theorem cannot easily be extended on 
finite alphabets with more than two letters and it is conjectured that even the 
opposite of Fact 4 holds true for these alphabets. In the following section we 
discuss this fairly counter-intuitive assumption. 

3 On the Learnability of ePATtf 

Trivial alphabets, for which ePATtf is learnable (cf. [11]), considerably ease the 
construction of telltales. Consequently, the recent negative result on binary al- 
phabets (cf. Fact 4) -- revealing that the assumed uniqueness of the approaches 
on trivial alphabets indeed might not be a matter of the methods, but of the 
subject - promotes the guess that ePATtf should not be learnable for every non- 
trivial alphabet. This surmise is supported by the fundamental algebraic theorem 
that for the free semigroup with two generators and for every $n \in \mathbb{N}$ there exists
a free sub-semigroup with n generators and, thus, that the expressive power of 
words over three or more letters does not exceed that of words over two letters. 
Furthermore, there also exists a pattern specific hint backing this expectation 
since there seems to be no significant difference between terminal-free E-pattern 
languages over two and those over three letters (derived directly from Fact 1): 

Theorem 1. Let $\Sigma_1, \Sigma_2$ be finite alphabets, $|\Sigma_1| = 2$ and $|\Sigma_2| \ge 3$. Let $\alpha, \beta$ be
terminal-free patterns. Then $L_{\Sigma_1}(\alpha) \subseteq L_{\Sigma_1}(\beta)$ iff $L_{\Sigma_2}(\alpha) \subseteq L_{\Sigma_2}(\beta)$.

Thus, there is some evidence to suggest that Fact 4 might be extendable on 
all non-trivial terminal alphabets. In fact, our main result finds the opposite to 
be true: 

Theorem 2. Let $\Sigma$ be a finite alphabet, $|\Sigma| \ge 3$. Then $\mathrm{ePAT}_{\mathrm{tf},\Sigma} \in \text{LIM-TEXT}$.

The proof of this theorem requires a broad combinatorial reasoning; it is accom- 
plished in Section 3.1. 

With Theorem 2 we can give a complete characterisation of the learnability 
of ePATtf, subject to alphabet size (for those cases not covered by Theorem 2, 
refer to [11] or Fact 4, respectively): 

Corollary 1. Let $\Sigma$ be an alphabet. Then $\mathrm{ePAT}_{\mathrm{tf},\Sigma} \in \text{LIM-TEXT}$ iff $|\Sigma| \ne 2$.

Consequently, we can state a discontinuity in the learnability of terminal-free 
E-pattern languages that - though it has been conjectured in [13] - seems to 
be rather unexpected and that might explain the lack of comprehensive results 
on this subject in the past decades. The following section is dedicated to the 
proof of Theorem 2, but a precise and language theoretical explanation of the 
demonstrated singularity of terminal-free E-pattern languages over two letters 
is still open. 



3.1 Proof of the Main Result 

The proof of Theorem 2 consists of several steps: a characterisation of prolix 
patterns, a particular type of substitution, a learnability criterion for classes of 
terminal-free E-pattern languages, and some lemmata combining these elements. 

To begin with, we give the characterisation of prolix patterns, that - although 
not implying a new decidability result (cf. Fact 2) - is a crucial instrument for 
our proof of the main theorem (see explanation after Theorem 4) as it gives a 
compact description of prolixness. Actually, in our reasoning we only use the 
if part of the following theorem, but we consider the characterisation of some 
interest since prolix terminal-free patterns may be seen as solution candidates for 
Post’s Correspondence Problem if the empty substitution is allowed (the other 
case has been analysed e.g. in [9] and [8]). 

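Theorem 3. A terminal-free pattern $\alpha$ is prolix iff there exists a decomposition
$$\alpha = \beta_0\, \gamma_1\, \beta_1\, \gamma_2\, \beta_2 \cdots \beta_{n-1}\, \gamma_n\, \beta_n$$
for an $n \ge 1$, arbitrary $\beta_i \in X^*$ and $\gamma_i \in X^+$, $i \le n$, such that

1. $\forall\, i : |\gamma_i| \ge 2$,

2. $\forall\, i, i' : \mathrm{var}(\gamma_i) \cap \mathrm{var}(\beta_{i'}) = \emptyset$,

3. $\forall\, i\ \exists\, y_i \in \mathrm{var}(\gamma_i) : \big(|\gamma_i|_{y_i} = 1 \wedge \forall\, i' \le n : (y_i \in \mathrm{var}(\gamma_{i'}) \Rightarrow \gamma_i = \gamma_{i'})\big)$.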

Proof. We first prove the if part of the theorem. Hence, let $\alpha \in \mathrm{Pat}_{\mathrm{tf}}$ be a
pattern such that there exists a decomposition satisfying conditions 1, 2, and 3.
We show that then there exist a pattern $\delta \in \mathrm{Pat}_{\mathrm{tf}}$ and two morphisms $\phi$ and $\psi$
with $|\delta| < |\alpha|$, $\phi(\alpha) = \delta$, and $\psi(\delta) = \alpha$. Thus, we use Fact 1 as a criterion for
the equivalence of E-pattern languages.

We define $\delta := \beta_0\, y_1\, \beta_1\, y_2\, \beta_2 \cdots \beta_{n-1}\, y_n\, \beta_n$ with $y_i$ derived from condition 3
for every $i \le n$. Then condition 1 implies $|\delta| < |\alpha|$; the existence of $\phi$ and $\psi$ ($\phi$
mapping $\gamma_i$ on $y_i$ and $\psi$ mapping $y_i$ on $\gamma_i$ for every $i \le n$, both of the morphisms
leaving all other variables unchanged) results from conditions 2 and 3.

Due to space constraints and as it is not needed for our subsequent reasoning, 
the proof of the only if part is merely given as an extended sketch. 

Assume that $\alpha \in \mathrm{Pat}_{\mathrm{tf}}$ is prolix. We show that then there exists at least one
decomposition of $\alpha$ satisfying conditions 1, 2, and 3: Because of the assumption
and Fact 1, there exist a succinct pattern $\delta \in \mathrm{Pat}_{\mathrm{tf}}$ and two morphisms $\phi$ and
$\psi$ with $|\delta| < |\alpha|$, $\phi(\alpha) = \delta$, and $\psi(\delta) = \alpha$. Since $\delta$ is succinct it is obvious that
$|\psi(x_j)| \ge 1$ for every $x_j \in \mathrm{var}(\delta)$. Moreover, we may conclude that for every
$x_j \in \mathrm{var}(\delta)$ there exists an $x_{j'} \in \mathrm{var}(\psi(x_j))$ such that $|\delta|_{x_j} = |\alpha|_{x_{j'}}$, as otherwise
$\delta$ would be prolix - according to the if part and because of $\phi(\alpha) = \delta$, leading
to $x_j \in \mathrm{var}(\phi(x_{j''}))$ for some $x_{j''} \in \mathrm{var}(\alpha)$. Therefore the following fact (later
referred to as (*)) is evident: Without loss of generality, $\delta$, $\phi$, and $\psi$ can be
chosen such that $x_j \in \mathrm{var}(\psi(x_j))$ for every $x_j \in \mathrm{var}(\delta)$, $\phi(x_j) = x_j$ for every
$x_j \in \mathrm{var}(\alpha) \cap \mathrm{var}(\delta)$, and $\phi(x_{j'}) = e$ for every $x_{j'} \in \mathrm{var}(\alpha) \setminus \mathrm{var}(\delta)$.

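In order to provide a basic decomposition of $\alpha$ we now define some appropriate
subsets of $\mathrm{var}(\alpha)$: First, $Y_1 := \{x_{j_1} \in \mathrm{var}(\alpha) \mid |\psi(\phi(x_{j_1}))| \ge 2\}$, second,
$Y_2 := \{x_{j_2} \in \mathrm{var}(\alpha) \mid \phi(x_{j_2}) = e\}$, and finally $Y_3 := \mathrm{var}(\alpha) \setminus (Y_1 \cup Y_2)$. These
definitions entail $Y_1 \cap Y_2 = \emptyset$, $Y_2 \ne \emptyset$ (because of $|\delta| < |\alpha|$), and $|\phi(x_{j_3})| =
|\psi(\phi(x_{j_3}))| = 1$ for all $x_{j_3} \in Y_3$ (because of (*)). Using these sets of variables we
examine the following decomposition: $\alpha = \beta_0\, \gamma_1\, \beta_1\, \gamma_2\, \beta_2 \cdots \beta_{m-1}\, \gamma_m\, \beta_m$ with
$\beta_0, \beta_m \in Y_3^*$, $\beta_i \in Y_3^+$ for $0 < i < m$, and $\gamma_i \in (Y_1 \cup Y_2)^+$ for all $i \le m$.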

This decomposition is unique. Obviously, it satisfies condition 2, and because
of (*) we may state fact (**): $\gamma_i = \psi(\phi(\gamma_i))$ for every $i$, $1 \le i \le m$.

This leads to $\mathrm{var}(\gamma_i) \cap Y_1 \ne \emptyset$ for all $i \le m$, and therefore condition 1 is
satisfied. Now we can identify the following two cases:

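Case A: $\forall\, i$, $1 \le i \le m$ : $\sum_{x_{j_1} \in Y_1} |\gamma_i|_{x_{j_1}} = 1$

Consequently, if $\mathrm{var}(\gamma_i) \cap \mathrm{var}(\gamma_{i'}) \cap Y_1 \ne \emptyset$ for some $i, i'$, $i \ne i'$, then $\phi(\gamma_i) = \phi(\gamma_{i'})$
and also $\psi(\phi(\gamma_i)) = \psi(\phi(\gamma_{i'}))$. Thus, with (**) we can state $\gamma_i = \gamma_{i'}$, and
therefore condition 3 for the basic decomposition is satisfied.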

Case B: $\exists\, i$, $1 \le i \le m$ : $\sum_{x_{j_1} \in Y_1} |\gamma_i|_{x_{j_1}} = p \ne 1$ for a $p \in \mathbb{N}$

Because of condition 1 being satisfied, we can assume in this case that $p \ge 2$.
Hence, we examine all modified decompositions of $\alpha$ that match the following
principle for every $i$ meeting the requirement of Case B:

$$\alpha = \beta_0\, \gamma_1\, \beta_1\, \gamma_2\, \beta_2 \cdots \beta_{i-1}\, \underbrace{\gamma_{i_1}\, \beta_{i_1}\, \gamma_{i_2}\, \beta_{i_2} \cdots \beta_{i_{p-1}}\, \gamma_{i_p}}_{\gamma_i}\, \beta_i \cdots \beta_{m-1}\, \gamma_m\, \beta_m$$

such that $\beta_i$ for all $i$ and $\gamma_{i'}$ for all $i' \ne i$ derive from the previous definition, and,
furthermore, such that $\beta_{i_k} = e$ for all $k$, $1 \le k \le p-1$, and $\gamma_{i_k} = \gamma_{i_k,l}\, x_{j_1,k}\, \gamma_{i_k,r}$
for all $k$, $1 \le k \le p$, with $|\gamma_{i_k}| \ge 2$ and $\gamma_{i_k,l}, \gamma_{i_k,r} \in Y_2^*$ and $x_{j_1,k} \in Y_1$. Then, for
all of these newly created decompositions, conditions 1 and 2 still are satisfied.
Because of $\psi(\phi(\gamma_i)) = \gamma_i$, one of these decompositions meets the
requirement of Case A. □

As an illustration of Theorem 3 we now analyse some terminal-free patterns: 

Example 1. $x_1 x_2 x_2 x_1 x_2 x_2$ is prolix since $\gamma_1 = \gamma_2 = x_1 x_2 x_2$ and $\beta_0 = \beta_1 =
\beta_2 = e$. $x_1 x_2 x_2 x_1 x_2 x_2 x_2$ and $x_1 x_2 x_1 x_3 x_4 x_2 x_4 x_3$ are succinct since no variable
for every of its occurrences has the same "environment" (i.e. a suitable $\gamma$) of
length greater or equal 2 such that this environment does not share any of its
variables with any potential $\beta$. $x_1 x_2 x_1 x_2 x_3 x_3 x_2 x_4 x_4 x_5 x_3 x_2 x_4 x_4 x_5$ is prolix since
$\gamma_1 = \gamma_2 = x_1 x_2$, $\gamma_3 = \gamma_4 = x_2 x_4 x_4 x_5$, $\beta_0 = \beta_1 = \beta_4 = e$, $\beta_2 = x_3 x_3$, $\beta_3 = x_3$.
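
Small patterns such as those of Example 1 can also be tested by brute force. The sketch below does not implement the decomposition of Theorem 3. It relies on the observation (cf. (*) in the proof above) that for a prolix pattern a shorter equivalent pattern $\delta$ can be obtained by erasing some variables of $\alpha$, so it tries every non-empty proper subset of variables and searches for a morphism $\psi$ with $\psi(\delta) = \alpha$. The names `maps_onto`, `is_prolix`, and `pat` are hypothetical, and the running time is exponential.

```python
from itertools import combinations
from typing import Dict, List


def maps_onto(delta: List[str], alpha: List[str]) -> bool:
    """True iff some morphism psi satisfies psi(delta) == alpha."""
    psi: Dict[str, List[str]] = {}

    def go(i: int, pos: int) -> bool:
        if i == len(delta):
            return pos == len(alpha)
        x = delta[i]
        if x in psi:
            img = psi[x]
            return alpha[pos:pos + len(img)] == img and go(i + 1, pos + len(img))
        for end in range(pos, len(alpha) + 1):
            psi[x] = alpha[pos:end]
            if go(i + 1, end):
                return True
        del psi[x]
        return False

    return go(0, 0)


def is_prolix(alpha: List[str]) -> bool:
    """Brute-force prolixness test for a terminal-free pattern."""
    variables = sorted(set(alpha))
    for k in range(1, len(variables)):             # non-empty proper subsets only
        for erased in combinations(variables, k):
            delta = [x for x in alpha if x not in erased]   # phi erases these variables
            if maps_onto(delta, alpha):
                return True
    return False


def pat(indices: str) -> List[str]:
    """Shorthand: pat("122") is the pattern x1 x2 x2 (single-digit indices only)."""
    return ["x" + d for d in indices]
```

For instance, `is_prolix(pat("122122"))` holds, whereas `is_prolix(pat("1221222"))` does not, in line with Example 1.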

As pointed out in [13], certain words due to their ambiguity are unsuitable 
for being part of a telltale. In the following definition we introduce a particular 
type of substitution that - depending on the pattern it is applied to - may lead 
to ambiguous words as well; nevertheless, it can be used to generate telltale 
words as it imposes appropriate restrictions upon their ambiguity. This feature 
is relevant for the learnability criterion in Theorem 4. 

Definition 1. Let $\alpha$ be a terminal-free pattern, $|\alpha| =: n$, and $\sigma$ a substitution.
For any $m \le n$ let $\alpha\backslash m = y_1 y_2 \cdots y_m$ be the prefix of length $m$ of $\alpha$. Let $r_1, r_2,
\ldots, r_{n-1}$ and $l_2, l_3, \ldots, l_n$ be the smallest natural numbers such that for every
substitution $\sigma'$ with $\sigma'(\alpha) = \sigma(\alpha)$ and for $m = 1, 2, \ldots, n-1$:

$$|\sigma(\alpha\backslash m)| - r_m \;\le\; |\sigma'(\alpha\backslash m)| \;\le\; |\sigma(\alpha\backslash m)| + l_{m+1}.$$

Furthermore, define $l_1 := 0 =: r_n$.

Then we call the substitution $\sigma$ $(\lambda, \rho)$-significant (for $\alpha$) iff there exist two
mappings $\lambda, \rho : \mathbb{N} \to \mathbb{N}$ such that, for every $x_j \in \mathrm{var}(\alpha)$, $\lambda(j) = \max\{l_m \mid y_m =
x_j\}$, $\rho(j) = \max\{r_m \mid y_m = x_j\}$, and $|\sigma(x_j)| \ge \lambda(j) + \rho(j) + 1$. We designate
a word $w$ as significant (for $\alpha$) iff for some $\lambda, \rho$ there exists a $(\lambda, \rho)$-significant
substitution $\sigma$ such that $w = \sigma(\alpha)$.

The following example illustrates Definition 1: 

Example 2. Let $\alpha := x_1 x_2 x_3 x_4 x_1 x_4 x_3 x_2$. Obviously, $\alpha$ is terminal-free and succinct
(cf. Theorem 3). Let the substitution $\sigma$ be given by $\sigma(x_1) := a$, $\sigma(x_2) := ab$,
$\sigma(x_3) := b$, and $\sigma(x_4) := aa$. With little effort it can be seen that there exists
only one different substitution $\sigma'$ such that $\sigma'(\alpha) = \sigma(\alpha)$, namely $\sigma'(x_1) = aa$,
$\sigma'(x_2) = b$, $\sigma'(x_3) = ba$, and $\sigma'(x_4) = a$. In terms of Definition 1 this implies
the following:

$\sigma(\alpha\backslash 1) = a$, $\sigma(\alpha\backslash 2) = aab$, $\sigma(\alpha\backslash 3) = aabb$, $\sigma(\alpha\backslash 4) = aabbaa$, ...

$\sigma'(\alpha\backslash 1) = aa$, $\sigma'(\alpha\backslash 2) = aab$, $\sigma'(\alpha\backslash 3) = aabba$, $\sigma'(\alpha\backslash 4) = aabbaa$, ...

Thus, $l_2 = l_4 = l_6 = l_8 = 1$, $l_1 = l_3 = l_5 = l_7 = 0$, and $r_k = 0$ for $1 \le k \le 8$.

Then, with $\lambda(1) = \lambda(3) = 0$, $\lambda(2) = \lambda(4) = 1$, and $\rho(j) = 0$ for $1 \le j \le 4$, the
substitution $\sigma$ is $(\lambda, \rho)$-significant for $\alpha$, since $|\sigma(x_j)| \ge \lambda(j) + \rho(j) + 1$ for every $j$,
$1 \le j \le 4$. Consequently, there are certain subwords in $w := \sigma(\alpha) = \sigma'(\alpha)$ that
are generated for every possible substitution by the same variable; therefore we
may regard the following variables and subwords - that, in a different example,
of course can consist of more than one letter each - as "associated":



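[Figure: the word $w = aabbaaaaabab$, in which the letters at positions 1, 3, 4, 6, 7, 9, 10, and 12 are marked as associated with the occurrences of $x_1, x_2, x_3, x_4, x_1, x_4, x_3$, and $x_2$, respectively.]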



That is the particular property of significant words which serves our purposes. 
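
The quantities of Definition 1 can be recomputed for Example 2 by exhaustive search. The following sketch is our own illustration with hypothetical helper names (`all_substitutions`, `prefix_lengths`, `margins`, `is_significant`); it assumes terminal-free patterns given as lists of variable names and enumerates every substitution generating the same word, which is feasible only for short words.

```python
from itertools import accumulate
from typing import Dict, Iterator, List, Tuple

Pattern = List[str]
Subst = Dict[str, str]


def all_substitutions(pattern: Pattern, word: str) -> Iterator[Subst]:
    """Yield every substitution s (variable -> word) with s(pattern) == word."""
    s: Subst = {}

    def go(i: int, pos: int) -> Iterator[Subst]:
        if i == len(pattern):
            if pos == len(word):
                yield dict(s)
            return
        x = pattern[i]
        if x in s:
            if word.startswith(s[x], pos):
                yield from go(i + 1, pos + len(s[x]))
            return
        for end in range(pos, len(word) + 1):
            s[x] = word[pos:end]
            yield from go(i + 1, end)
        del s[x]

    yield from go(0, 0)


def prefix_lengths(s: Subst, pattern: Pattern) -> List[int]:
    """Lengths of the images of the prefixes of length 1, ..., n."""
    return list(accumulate(len(s[x]) for x in pattern))


def margins(pattern: Pattern, sigma: Subst) -> Tuple[List[int], List[int]]:
    """Return ([l_1..l_n], [r_1..r_n]) in the sense of Definition 1."""
    n = len(pattern)
    word = "".join(sigma[x] for x in pattern)
    base = prefix_lengths(sigma, pattern)
    l, r = [0] * n, [0] * n                       # l_1 = 0 = r_n by definition
    for other in all_substitutions(pattern, word):
        lens = prefix_lengths(other, pattern)
        for m in range(1, n):                     # m = 1, ..., n-1
            shift = lens[m - 1] - base[m - 1]
            if shift > 0:
                l[m] = max(l[m], shift)           # list slot m holds l_{m+1}
            else:
                r[m - 1] = max(r[m - 1], -shift)  # list slot m-1 holds r_m
    return l, r


def is_significant(pattern: Pattern, sigma: Subst) -> bool:
    l, r = margins(pattern, sigma)
    for x in set(pattern):
        lam = max(l[m] for m, y in enumerate(pattern) if y == x)
        rho = max(r[m] for m, y in enumerate(pattern) if y == x)
        if len(sigma[x]) < lam + rho + 1:
            return False
    return True
```

Applied to the pattern and substitution of Example 2, `margins` returns the values of $l_k$ and $r_k$ stated there, and `is_significant` confirms that $\sigma$ is significant.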

A second (and “comprehensive”) example for a substitution generating sig- 
nificant words is given in Lemma 1. 

Now we present the learnability criterion to be used, that is a generalisation 
of two criteria in [13]. As mentioned in Example 2, this criterion utilizes the 
existence of certain subwords in significant words that may be mapped to a 
distinct variable. In these subwords we place a single letter as a marker for its 
variable such that the exact shape of the generating pattern can be extracted 
from a suitable set of these words. The need for this distinct marker is an oblique 
consequence of a method used in [13] - the so-called inverse substitution. 

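Theorem 4. Let $\Sigma$ be an alphabet. Let $\mathrm{Pat}^\star_{\mathrm{tf}}$ be a recursively enumerable set
of terminal-free patterns and $\mathrm{ePAT}^\star_{\mathrm{tf},\Sigma}$ the corresponding class of E-pattern
languages. Then $\mathrm{ePAT}^\star_{\mathrm{tf},\Sigma} \in \text{LIM-TEXT}$ if for every $\alpha \in \mathrm{Pat}^\star_{\mathrm{tf}}$ there exists a finite
set $\mathrm{SUB} := \{\sigma_i \mid 1 \le i \le n\}$ of substitutions and mappings $\lambda_i$ and $\rho_i$ such that

1. every $\sigma_i \in \mathrm{SUB}$ is $(\lambda_i, \rho_i)$-significant for $\alpha$ and

2. for every $x_j \in \mathrm{var}(\alpha)$ there exists a substitution $\sigma_{j'} \in \mathrm{SUB}$ with $\sigma_{j'}(x_j) =
u_{j',j}\; a\; v_{j',j}$ for a letter $a \in \Sigma$ and some $u_{j',j}, v_{j',j} \in \Sigma^*$, $|u_{j',j}| \ge \lambda_{j'}(j)$
and $|v_{j',j}| \ge \rho_{j'}(j)$, such that $|\sigma_{j'}(\alpha)|_a = |\alpha|_{x_j}$.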

Proof. Given $\alpha \in \mathrm{Pat}^\star_{\mathrm{tf}}$, we define a set of words by $T_\alpha := \{w_i \mid \sigma_i(\alpha) =
w_i \text{ for a } \sigma_i \in \mathrm{SUB}\}$. We now show that $T_\alpha$ is a telltale for $L(\alpha)$ in respect of
$\mathrm{ePAT}^\star_{\mathrm{tf},\Sigma}$. For that purpose assume $T_\alpha \subseteq L(\beta) \subseteq L(\alpha)$ for some $\beta \in \mathrm{Pat}^\star_{\mathrm{tf}}$. Then
(due to Fact 1) there exists a morphism $\phi : X^* \to X^*$ such that $\phi(\alpha) = \beta$.

Because every $\sigma_i \in \mathrm{SUB}$ is $(\lambda_i, \rho_i)$-significant for $\alpha$ and because of condition 2
we may conclude that for every $x_j \in \mathrm{var}(\alpha)$ and for every $\sigma'$ with $\sigma'(\beta) =
w_{j'} = \sigma_{j'}(\alpha)$ - that necessarily exists since $T_\alpha \subseteq L(\beta)$ - the following holds:
$\sigma'(\phi(x_j)) = u'_{j',j}\; a\; v'_{j',j}$ for a letter $a \in \Sigma$ and two words $u'_{j',j}, v'_{j',j} \in \Sigma^*$; for
these words, due to the significance of $w_{j'}$, it is evident that $u'_{j',j}$ is a suffix of
$u_{j',j}$ (or vice versa) and $v'_{j',j}$ is a prefix of $v_{j',j}$ (or vice versa). In addition it
is obvious that $|\sigma'(\beta)|_a = |\alpha|_{x_j}$ can be stated for the examined $\sigma'$. Therefore,
in order to allow appropriate substitutions to generate the single letter $a$, $\phi$
for all $x_j \in \mathrm{var}(\alpha)$ must have the shape $\phi(x_j) = \gamma_1\, x_{j_a}\, \gamma_2$ with $\gamma_1, \gamma_2 \in X^*$
and a single variable $x_{j_a} \in \mathrm{var}(\phi(x_j))$, i.e. $|\beta|_{x_{j_a}} = |\alpha|_{x_j}$. Hence, the morphism
$\psi : X^* \to X^*$, defined by $\psi(x_k) := x_j$ for $k = j_a$ and $\psi(x_k) := e$ for $k \ne j_a$,
leads to $\psi(\beta) = \alpha$ and - with the assumption $L(\beta) \subseteq L(\alpha)$ - to $L(\beta) = L(\alpha)$.
Consequently, $L(\beta) \not\subset L(\alpha)$ for every pattern $\beta$ with $T_\alpha \subseteq L(\beta)$ and, thus, $T_\alpha$ is
a telltale for $L(\alpha)$ in respect of $\mathrm{ePAT}^\star_{\mathrm{tf},\Sigma}$ (because of Fact 3). □

In prolix patterns, there exist variables that cannot be substituted in such
a way that the resulting word is significant. For instance, in the pattern $\alpha =
x_1 x_2 x_1 x_2$ every subword generated by $x_1$ can be generated by $x_2$ as well, and
vice versa. Therefore, when applying Theorem 4 to $\mathrm{ePAT}_{\mathrm{tf}}$, $\mathrm{Pat}^\star_{\mathrm{tf}}$ necessarily has
to consist of succinct patterns only.

The following two lemmata prove that for every succinct terminal-free pattern 
(and, thereby, for every language in ePATtf) there exists a set of substitutions 
satisfying the conditions of Theorem 4. 

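Lemma 1. Let $\alpha$ be a succinct pattern, $\alpha \in \mathrm{Pat}_{\mathrm{tf}}$, and $\Sigma$ an alphabet such that
$\{a, b, c\} \subseteq \Sigma$. Let for every $x_j \in \mathrm{var}(\alpha)$ and for every $i \in \{j \mid x_j \in \mathrm{var}(\alpha)\}$ the
substitution $\sigma_i^{\mathrm{tf}}$ be given by

$$\sigma_i^{\mathrm{tf}}(x_j) := \begin{cases} a\,b^{3j-2}\,a \;\; a\,b^{3j-1}\,a \;\; a\,b^{3j}\,a, & i \ne j, \\ a\,b^{3j-2}\,a \;\; c \;\; a\,b^{3j-1}\,a \;\; a\,b^{3j}\,a, & i = j. \end{cases}$$

Then for every $\sigma'$ with $\sigma'(\alpha) = \sigma_i^{\mathrm{tf}}(\alpha)$, for every $x_j \in \mathrm{var}(\alpha)$, and for some
$u_j, v_j \in \Sigma^*$:

$$\sigma'(x_j) = \begin{cases} u_j\; a \;\; a\,b^{3j-1}\,a \;\; a\; v_j, & i \ne j, \\ u_j\; a \;\; c \;\; a\,b^{3j-1}\,a \;\; a\; v_j, & i = j. \end{cases}$$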
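
To make the shape of these substitutions concrete, the following sketch builds the words $\sigma_i^{\mathrm{tf}}(x_j)$ as strings. The function names `segment`, `sigma_tf`, and `image`, and the encoding of a pattern as a list of variable indices, are our own assumptions for this illustration.

```python
from typing import List


def segment(k: int) -> str:
    """The segment a b^k a."""
    return "a" + "b" * k + "a"


def sigma_tf(i: int, j: int) -> str:
    """Image of x_j under the i-th substitution; the marker c appears only for i == j."""
    marker = "c" if i == j else ""
    return segment(3 * j - 2) + marker + segment(3 * j - 1) + segment(3 * j)


def image(i: int, pattern: List[int]) -> str:
    """Image of a terminal-free pattern, given as a list of variable indices."""
    return "".join(sigma_tf(i, j) for j in pattern)
```

For example, `sigma_tf(1, 1)` is `"abacabbaabbba"`. The three segments have lengths $3j$, $3j+1$, and $3j+2$, so an image without marker has length $9j+3$, and in `image(i, pattern)` the letter `c` occurs exactly as often as $x_i$ occurs in the pattern.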

Proof. To begin with we explain the following terms that are used frequently: A
segment of $\sigma_i^{\mathrm{tf}}(x_j)$ is a subword $a\,b^{3j-q}\,a$, $0 \le q \le 2$. As the natural extension
thereof, the term segment of $\sigma_i^{\mathrm{tf}}(\delta)$ for every $\delta \in X^*$ designates any segment of
$\sigma_i^{\mathrm{tf}}(x_j)$ with $x_j \in \mathrm{var}(\delta)$. An outer segment of $\sigma_i^{\mathrm{tf}}(x_j)$ is the subword $a\,b^{3j-2}\,a$
or the subword $a\,b^{3j}\,a$. The inner segment of $\sigma_i^{\mathrm{tf}}(x_j)$ is the subword $a\,b^{3j-1}\,a$.
$\sigma'(x_{j'})$ contains segments of $\sigma_i^{\mathrm{tf}}(x_j)$ means that the segments of $\sigma_i^{\mathrm{tf}}(x_j)$ occur in
natural order (i.e. in that order specified by $\sigma_i^{\mathrm{tf}}$), consecutively (apart from the
potential necessity of inserting the letter $c$), and non-overlapping.

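Let $\sigma'$ be an arbitrary substitution with $\sigma'(\alpha) = \sigma_i^{\mathrm{tf}}(\alpha)$ and $\sigma'(x_j) \ne \sigma_i^{\mathrm{tf}}(x_j)$
for an $x_j \in \mathrm{var}(\alpha)$. Then we define the following subsets of $\mathrm{var}(\alpha)$: Let $Y_1$
be the set of all $x_{j_1} \in \mathrm{var}(\alpha)$ such that $\sigma'(x_{j_1})$ contains the inner segment of
$\sigma_i^{\mathrm{tf}}(x_{j_1})$, of every outer segment at least one letter, and at least one segment of
the substitution $\sigma_i^{\mathrm{tf}}$ of a neighbouring variable. Consequently, $\sigma'(x_{j_1})$ contains
at least two segments of $\sigma_i^{\mathrm{tf}}(x_{j_1})$, and for all $x_{j_1} \in Y_1$: $\alpha \ne \delta_1\, x_{j_1} x_{j_1}\, \delta_2$ with
$\delta_1, \delta_2 \in X^*$. Let $Y_2$ be the set of all $x_{j_2} \in \mathrm{var}(\alpha)$ such that $\sigma'(x_{j_2})$ contains of
at least one segment of $\sigma_i^{\mathrm{tf}}(x_{j_2})$ no letter. Then $Y_1 \cap Y_2 = \emptyset$. Finally, let $Y_3$ be
given by $Y_3 := \mathrm{var}(\alpha) \setminus (Y_1 \cup Y_2)$. Then $\sigma'(x_{j_3})$ for all $x_{j_3} \in Y_3$ contains the
inner segment of $\sigma_i^{\mathrm{tf}}(x_{j_3})$ and of both outer segments at least one letter, but no
complete segment of a neighbouring variable.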

Now assume to the contrary $Y_2 \ne \emptyset$, that implies $Y_1 \ne \emptyset$ as for every variable
there are three unique corresponding segments (for two segments, depending on
$\alpha$ and $\sigma'$, we might face the situation that $Y_2, Y_3 \ne \emptyset$, but $Y_1 = \emptyset$). We show that
this assumption entails $\alpha$ being prolix. The subsequent argumentation on this
utilizes Theorem 3 and an evident fact (referred to as (*)): For every $\delta \in Y_2^+$,
$\sigma'(\delta)$ contains of at least one segment of $\sigma_i^{\mathrm{tf}}(\delta)$ no letter.

As the starting point of our reasoning we use the following decomposition:
$\alpha = \beta_0\, \gamma_1\, \beta_1\, \gamma_2\, \beta_2 \cdots \beta_{n-1}\, \gamma_n\, \beta_n$ with $n \ge 1$, $\beta_0, \beta_n \in Y_3^*$, $\beta_k \in Y_3^+$ for
$0 < k < n$, and $\gamma_k \in (Y_1 \cup Y_2)^+$ for $1 \le k \le n$. This decomposition is unique,
and it obviously satisfies condition 2 and - due to (*) - condition 1 of Theorem 3.
However, concerning condition 3 it possibly deserves some modifications. To this
end, the following procedure reconstructs the above decomposition such that in
every $\gamma_k$ there is exactly one occurrence of a variable in $Y_1$ (using (*) again):

PROCEDURE:

Define $k := 1$.

STEP 1: Let $y_1, y_2$ be the leftmost variables in $\gamma_k$ with $y_1, y_2 \in Y_1$ and $\gamma_k =
\delta_1\, y_1\, y_2\, \delta_2$ for $\delta_1, \delta_2 \in X^*$. IF these variables exist, THEN define $\gamma_{n+1} :=
\gamma_n, \ldots, \gamma_{k+2} := \gamma_{k+1}$, $\beta_{n+1} := \beta_n, \ldots, \beta_{k+1} := \beta_k$, $\gamma_k := \delta_1\, y_1$, $\beta_k := e$,
and $\gamma_{k+1} := y_2\, \delta_2$; finally, define $n := n + 1$. END IF. IF $k < n$, THEN define
$k := k + 1$ and go to STEP 1. ELSE rename all pattern fragments as follows:
$\alpha =: \beta_{0_1}\, \gamma_{1_1}\, \beta_{1_1}\, \gamma_{2_1}\, \beta_{2_1} \cdots \beta_{(n-1)_1}\, \gamma_{n_1}\, \beta_{n_1}$. Finally, define $k_1 := 1_1$ and go to
STEP 2. END IF.

STEP 2: Let $y_1 \in Y_2$ and $y_2 \in Y_1$ be the leftmost variables in $\gamma_{k_1}$ with
$\gamma_{k_1} = \delta_1\, y_1\, y_2\, \delta_2$ for $\delta_1, \delta_2 \in X^*$, such that at least one segment of $\sigma_i^{\mathrm{tf}}(y_1)$ is
generated by variables in $\delta_1$. IF these variables exist, THEN IF $\sigma'(y_2)$ contains a
segment of $\sigma_i^{\mathrm{tf}}(\delta_2)$ (thus, necessarily the leftmost segment), THEN extend the
decomposition as described in STEP 1. Finally, define $k_1 := (k+1)_1$, $n_1 := (n+1)_1$,
and go to STEP 2. ELSE define $Y_1 := Y_1 \setminus \{y_2\}$ and $Y_3 := Y_3 \cup \{y_2\}$, reconstruct
all pattern fragments $\beta_{k_1}$ and $\gamma_{k_1}$ accordingly and go to STEP 2. END IF. ELSE
IF $k_1 < n_1$, THEN define $k_1 := (k+1)_1$ and go to STEP 2. ELSE rename all
pattern fragments as follows: $\alpha =: \beta_{0_2}\, \gamma_{1_2}\, \beta_{1_2}\, \gamma_{2_2}\, \beta_{2_2} \cdots \beta_{(n-1)_2}\, \gamma_{n_2}\, \beta_{n_2}$. Finally,
define $k_2 := 1_2$ and go to STEP 3. END IF. END IF.

STEP 3: Let $y_1 \in Y_1$ and $y_2 \in Y_2$ be the leftmost variables in $\gamma_{k_2}$ with
$\gamma_{k_2} = \delta_1\, y_1\, y_2\, \delta_2$ for $\delta_1, \delta_2 \in X^*$, such that at least one segment of $\sigma_i^{\mathrm{tf}}(y_2)$
is generated by variables in $\delta_2$. Modify $\gamma_{k_2}$ analogously to Step 2. When this
has been done for every $k_2$, then rename all pattern fragments as follows:
$\alpha =: \beta_{0_3}\, \gamma_{1_3}\, \beta_{1_3}\, \gamma_{2_3}\, \beta_{2_3} \cdots \beta_{(n-1)_3}\, \gamma_{n_3}\, \beta_{n_3}$. Define $k_3 := 1_3$ and go to STEP 4.

STEP 4: Let $y_1, y_2$ be the leftmost variables in $\gamma_{k_3}$ with $y_1, y_2 \in Y_2$ and $\gamma_{k_3} =
\delta_1\, y_1\, y_2\, \delta_2$ for $\delta_1, \delta_2 \in X^*$, such that at least one segment of $\sigma_i^{\mathrm{tf}}(y_1)$ is contained in
$\sigma'(\delta_1)$ and at least one segment of $\sigma_i^{\mathrm{tf}}(y_2)$ is contained in $\sigma'(\delta_2)$. IF these variables
exist, THEN extend the decomposition as described in STEP 1. Finally, define
$k_3 := (k+1)_3$, $n_3 := (n+1)_3$, and go to STEP 4. END IF. IF $k_3 < n_3$, THEN
define $k_3 := (k+1)_3$ and go to STEP 4. ELSE rename all pattern fragments as
follows: $\alpha =: \beta_{0_4}\, \gamma_{1_4}\, \beta_{1_4}\, \gamma_{2_4}\, \beta_{2_4} \cdots \beta_{(n-1)_4}\, \gamma_{n_4}\, \beta_{n_4}$. END IF.

END OF PROCEDURE

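Obviously, $\forall\, k_4, k_4' : \mathrm{var}(\gamma_{k_4}) \cap \mathrm{var}(\beta_{k_4'}) = \emptyset$. So the following is left to be
shown:

a) $\forall\, k_4 : |\gamma_{k_4}| \ge 2$

b) $\forall\, k_4 : \sum_{x_{j_1} \in Y_1} |\gamma_{k_4}|_{x_{j_1}} = 1$

c) $\forall\, k_4, k_4' : \big(\mathrm{var}(\gamma_{k_4}) \cap \mathrm{var}(\gamma_{k_4'}) \cap Y_1 \ne \emptyset \Rightarrow \gamma_{k_4} = \gamma_{k_4'}\big)$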

ad a) As a consequence of (*), the first decomposition of $\alpha$ is modified by Steps 1
- 4 if and only if there exists a $\gamma_k$ that contains at least two variables $y_1, y_2 \in Y_1$.
The procedure splits this $\gamma_k$ in such a way that all new $\gamma_{k_1}$, $\gamma_{k_2}$, $\gamma_{k_3}$, and $\gamma_{k_4}$
contain at least one $x_{j_2} \in Y_2$ and a sequence of variables that generates a segment
of $\sigma_i^{\mathrm{tf}}(x_{j_2})$ by $\sigma'$. Thus, $|\gamma_{k_4}| \ge 2$ for all $k_4$.

ad b) As mentioned in a), every $\gamma_{k_4}$ contains at least one variable $x_{j_2} \in Y_2$.
Moreover, there must also be a variable $x_{j_1} \in Y_1$ in every $\gamma_{k_4}$ (again due to (*)).
The procedure splits only those $\gamma_k$ that contain at least two variables of $Y_1$,
and obviously - if possible - in such a way that every $\gamma_{k_4}$ contains exactly one
$x_{j_1} \in Y_1$. If this due to a) is not possible (e.g. for $\gamma_{k_1} = y_1\, y_2\, y_3$ with $y_1, y_3 \in Y_1$,
$y_2 \in Y_2$, and necessarily $y_1 \ne y_3$), then in Step 2 or Step 3 either $y_1$ or $y_3$ is
removed from $Y_1$ and therefore it is removed from $\gamma_{k_2}$ or $\gamma_{k_3}$, respectively.

ad c) Because of a) and b) it is obvious that every $\gamma_{k_4}$ begins or ends with a
variable from $Y_2$. We consider that case where $\gamma_{k_4}$ begins with $x_{j_2,k_4} \in Y_2$; the
second case is symmetrical and the argumentation for the case that both aspects
hold true derives from the combination of both approaches.

Hence, without loss of generality $\gamma_{k_4} = x_{j_2,k_4}\, \delta_1$ for $\delta_1 \in X^*$. Due to (*)
and the construction of $\gamma_{k_4}$, to the right of $x_{j_2,k_4}$ there is a pattern fragment
$\gamma_{k_4,1}$, $|\gamma_{k_4,1}| \ge 1$, such that $\sigma'(\gamma_{k_4,1})$ contains at least one segment of $\sigma_i^{\mathrm{tf}}(x_{j_2,k_4})$.
If $\gamma_{k_4,1} \in Y_2^+$, then to the right of this pattern fragment there is a second
pattern fragment $\gamma_{k_4,2}$, $|\gamma_{k_4,2}| \ge 1$, that - with $\sigma'$ - again generates the segments
"missing" in $\sigma'(\gamma_{k_4,1})$ and so on. As every $\gamma_{k_4}$ has finite length, there must exist
an $x_{j_1,k_4} \in Y_1$ in $\gamma_{k_4}$ concluding this argumentation.

Consequently, $\gamma_{k_4} = x_{j_2,k_4}\, \gamma_{k_4,1}\, \gamma_{k_4,2} \cdots \gamma_{k_4,p-1}\, \gamma_{k_4,p}$ for a $p \in \mathbb{N}$.

However, since all variables in $\gamma_{k_4,l}$, $1 \le l \le p$, of at least one of their segments
do not generate any letter, $x_{j_1,k_4}$ by $\sigma'$ exactly determines the shape of $\gamma_{k_4,p}$,
$\gamma_{k_4,p}$ that of $\gamma_{k_4,p-1}$ etc. up to $\gamma_{k_4,1}$ determining $x_{j_2,k_4}$.

This holds true for every occurrence of $x_{j_1,k_4}$ and therefore $\forall\, k_4, k_4'\ \forall\, x_{j_1} \in
Y_1 : \big((x_{j_1} \in \mathrm{var}(\gamma_{k_4}) \wedge x_{j_1} \in \mathrm{var}(\gamma_{k_4'})) \Rightarrow \gamma_{k_4} = \gamma_{k_4'}\big)$. This proves c).

Thus, with a), b), c), and Theorem 3, $\alpha$ is prolix. This is a contradiction.
Consequently, we may conclude $Y_1 = Y_2 = \emptyset$; this proves the lemma. □

With Lemma 1, the major part of our reasoning is accomplished. Now the 
following lemma can be concluded without effort: 

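Lemma 2. Let $\alpha$ be a succinct pattern, $\alpha \in \mathrm{Pat}_{\mathrm{tf}}$. Then for every $i$, $i \in \{j \mid
x_j \in \mathrm{var}(\alpha)\}$, there exist mappings $\lambda_i, \rho_i : \mathbb{N} \to \mathbb{N}$ such that $\sigma_i^{\mathrm{tf}}$ (cf. Lemma 1)
is $(\lambda_i, \rho_i)$-significant for $\alpha$.

Proof. Directly from Lemma 1, since for every $x_j \in \mathrm{var}(\alpha)$ we can state $\lambda_i(j) \le
3j - 1$, $\rho_i(j) \le 3j + 1$, and $|\sigma_i^{\mathrm{tf}}(x_j)| \ge 9j + 3$. □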

Consequently, for every succinct terminal-free pattern there exists a set of
significant words. However, no word generated by any $\sigma_i^{\mathrm{tf}}$ needs to consist of
three different letters in order to be significant - $a$ and $b$ would be sufficient.
Indeed, due to the marker $c$, the given set of all $\sigma_i^{\mathrm{tf}}$ satisfies the second condition
of Theorem 4. Thus, Theorem 4 is applicable for $\mathrm{ePAT}_{\mathrm{tf}}$ and therefore the main
result of this paper, given in Theorem 2, is proven.

Abstract. We present two simple algorithms for SAT and prove upper
bounds on their running time. Given a Boolean formula F in conjunctive
normal form, the first algorithm finds a satisfying assignment for F (if
any) by repeating the following: Choose an assignment A at random
and search for a satisfying assignment inside a Hamming ball around
A (the radius of the ball depends on F). We show that this algorithm
solves SAT with a small probability of error in at most $2^{n - 0.712\sqrt{n}}$ steps,
where n is the number of variables in F. To derandomize this algorithm,
we use covering codes instead of random assignments. The deterministic
algorithm solves SAT in at most $2^{n - 2\sqrt{n/\log_2 n}}$ steps. To the best of our
knowledge, this is the first non-trivial bound for a deterministic SAT
algorithm with no restriction on clause length.



1 Introduction 

The propositional satisfiability problem (SAT) can be solved by an obvious
algorithm in $2^n$ steps where $n$ is the number of variables in the input formula.
During the past decade there was significant progress in proving better upper
bounds for the restricted version of SAT (known as $k$-SAT) that allows clauses of
length at most $k$. Both deterministic and randomized algorithms were developed
for $k$-SAT; the currently best known bounds are as follows:

- $\mathrm{poly}(n)\,(2 - 2/(k+1))^n$ for a deterministic $k$-SAT algorithm [4,3];

- $\mathrm{poly}(n)\,(2^{1 - \mu_k/(k-1) + o(1)})^n$ for a randomized $k$-SAT algorithm, where $k \ge 4$ and
$\mu_k \to \pi^2/6$ as $k \to \infty$ [8];

- $O(1.324^n)$ for a randomized 3-SAT algorithm and $O(1.474^n)$ for a randomized
4-SAT algorithm [7]; these bounds and other recent bounds for 3-SAT,
e.g., [1,6,11], are based on Schöning's local search algorithm [12,13] or on the
randomized DPLL approach of Paturi, Pudlák, Saks, and Zane [9,8].

* Supported in part by RAS program of fundamental research “Research in principal 
areas of contemporary mathematics”, RFBR grant ^02-01-00089, and by Award 
No. RM1-2409-ST-02 of the U.S. Civilian Research & Development Foundation for 
the Independent States of the Former Soviet Union (CRDF). 

However, the progress for SAT without the restriction on the clause length
is much more modest. Pudlák gives a randomized algorithm (based on [8]) that
solves SAT in expected time $\mathrm{poly}(n)\, m\, 2^{n - \varepsilon\sqrt{n}}$ where $n$ is the number of
variables, $m$ is the number of clauses, and $\varepsilon$ is a positive constant [10]. The most
recent bound for a randomized SAT algorithm is given by Schuler in [14]: his
algorithm (using the algorithm of [8]) runs in expected time $\mathrm{poly}(n)\, m\, 2^{n - n/(1 + \log_2 m)}$.
There are also bounds that are "more" dependent on the number of clauses or
other input parameters, e.g., $\mathrm{poly}(n)\, m\, 2^{0.30897 m}$ [5] for a deterministic SAT
algorithm.

In this paper, we give a randomized algorithm that solves SAT in expected
time $\mathrm{poly}(n)\, m^2\, 2^{n - 0.712\sqrt{n}}$ and a deterministic algorithm that solves SAT in
time $\mathrm{poly}(n)\, m^2\, 2^{n - 2\sqrt{n/\log_2 n}}$. To the best of our knowledge, the latter is the first
non-trivial bound for a deterministic SAT algorithm with no restriction on clause
length. The bound for the randomized algorithm is worse than Schuler's bound
[14]. However, our randomized algorithm uses another idea (the approach of [3]
based on covering the search space by Hamming balls) and has a derandomized
version (our deterministic algorithm).

Both our algorithms are based on the multistart local search approach that 
proved to be successful in randomized and deterministic algorithms for $k$-SAT
[13,3]. Similarly to other local search algorithms, our algorithms choose some as- 
signment of truth values to variables and then modify it step by step; sometimes 
the algorithm is restarted. There are two versions of this approach: “random- 
ized” search [13] where the algorithm performs a random walk and “determin- 
istic” search [3] where the algorithm recursively examines several possibilities 
to change the current assignments. In both versions, the random walk or the 
recursion is terminated after a specified number of steps, and the algorithm is 
restarted. We use the “deterministic” approach [3] for both deterministic and 
randomized algorithms: they search for a satisfying assignment inside a Ham- 
ming ball of a certain radius R around the initial assignment. More exactly, the 
search implementation either uses a minor modification of the procedure in [3] 
or examines all assignments in the Hamming ball, whichever is faster. 

The analysis of a randomized algorithm based on the multistart local search 
usually contains two parts: the estimation of the probability that the initial 
assignment is close enough to a satisfying assignment, and the estimation of the 
time needed to perform the search started from the initial assignment. In the 
analysis of a deterministic algorithm based on the same approach, the first part 
is replaced by the estimation of the number of initial assignments that are needed 
to guarantee that all $2^n$ assignments (the points of the Boolean cube $\{0,1\}^n$)
are covered by Hamming balls of radius R around the initial assignments¹. In
both cases, R is chosen to trade off between the number of initial assignments
and the running time inside each ball. Our analysis follows this general scheme
and, in addition, takes into account the fact that the time needed to find a
solution inside a ball varies from one initial assignment to another. Our key
lemma (Lemma 5) estimates the probability that this time is small enough, i.e.,
the lengths of clauses used by the algorithm are bounded by a certain function
of n.

¹ For example, the paper [3] gives two constructions of such coverings; we use the one
that finds the set of assignments for n/6 variables by a greedy algorithm for the Set
Cover problem, and then takes the direct product of 6 instances of the constructed
set. The construction is optimal both in time and the number of assignments; however,
the algorithm uses exponential space.

Organization of the paper. Sect. 2 defines basic notions and notation used in 
the paper. The randomized algorithm and its analysis are given in Sect. 3. This 
algorithm is derandomized in Sect. 4. 

2 Definitions and Notation 

Formulas and assignments. We deal with Boolean formulas in conjunctive normal
form (CNF). By a variable we mean a Boolean variable that takes truth
values $\top$ (true) or $\bot$ (false). A literal is a variable $x$ or its negation $\neg x$. If $l$ is a
literal then $\neg l$ denotes the opposite literal, i.e., if $l$ is $x$ then $\neg l$ denotes $\neg x$, and
if $l$ is $\neg x$ then $\neg l$ denotes $x$. Similarly, if $v$ denotes one of the truth values $\top$ or
$\bot$, we write $\neg v$ to denote the opposite truth value. A clause is a disjunction $C$
of literals such that $C$ contains no opposite literals. The length of $C$ (denoted
by $|C|$) is the number of literals in $C$. A formula is a conjunction of clauses.

An assignment to variables $x_1, \ldots, x_n$ is a mapping from $\{x_1, \ldots, x_n\}$ to
$\{\top, \bot\}$. This mapping is extended to literals: each literal $\neg x_i$ is mapped to the
truth value opposite to the value assigned to $x_i$. We say that a clause $C$ is satisfied
by an assignment $A$ if $A$ assigns $\top$ to at least one literal in $C$. Otherwise, we say
that $C$ is falsified by $A$. The formula $F$ is satisfied by $A$ if every clause in $F$ is
satisfied by $A$. In this case, $A$ is called a satisfying assignment for $F$.

Let $F$ be a formula and $l$ be a literal such that its variable occurs in $F$. We
write $F|_{l=\top}$ to denote the formula obtained from $F$ by assigning the value $\top$
to $l$. This formula is obtained from $F$ as follows: the clauses that contain $l$ are
deleted from $F$, and the literal $\neg l$ is deleted from the other clauses. Note that
$F|_{l=\top}$ may contain the empty clause or may be the empty formula. Let $A$ and
$A'$ be two assignments that differ only in the value assigned to a literal $l$. Then we
say that $A'$ is obtained from $A$ by flipping the value of $l$.
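The definitions above can be illustrated by a minimal sketch. The encoding is an assumption made only for this illustration (it does not come from the paper): a literal is a nonzero integer (+i for $x_i$, -i for $\neg x_i$), a clause is a frozenset of literals, a formula is a list of clauses, and an assignment is a dictionary from variable indices to truth values. The function names are hypothetical.

```python
def literal_value(lit, assignment):
    """Truth value that the assignment gives to a literal."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value


def satisfies(assignment, formula):
    """True if every clause contains at least one true literal."""
    return all(any(literal_value(l, assignment) for l in c) for c in formula)


def restrict(formula, lit):
    """Formula obtained by assigning true to lit:
    clauses containing lit are deleted, -lit is deleted from the others."""
    return [c - {-lit} for c in formula if lit not in c]


def flip(assignment, lit):
    """Copy of the assignment with the variable of lit flipped."""
    flipped = dict(assignment)
    flipped[abs(lit)] = not flipped[abs(lit)]
    return flipped


# (x1 or not x2) and (x2 or x3) and (not x1)
F = [frozenset({1, -2}), frozenset({2, 3}), frozenset({-1})]
A = {1: False, 2: False, 3: False}
restricted = restrict(F, -1)  # assign true to the literal "not x1"
```

Note that restricting a formula can produce the empty clause (an empty frozenset), which no assignment satisfies.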

Covering by balls. We identify $\top$ and $\bot$ with 1 and 0 respectively. Then any
assignment to variables $x_1, \ldots, x_n$ can be identified with a point in the Boolean
cube $\{0,1\}^n$. Let $A$ and $A'$ be assignments to $x_1, \ldots, x_n$, i.e., $A, A' \in \{0,1\}^n$.
The Hamming distance between $A$ and $A'$ is the number of variables $x_i$ such
that $A$ and $A'$ assign different values to $x_i$, i.e., the number of coordinates where
$A$ and $A'$ are different. The Hamming ball (or simply ball) of radius $R$ around
an assignment $A$ is the set of all assignments whose Hamming distance to $A$ is
less than or equal to $R$. The assignment $A$ is called the center of the ball. The
volume of a ball is the number of assignments that belong to the ball. We write
$V(n, R)$ to denote the volume of a ball of radius $R$ in $\{0,1\}^n$. It is well known
that the volume of a Hamming ball can be estimated in terms of the binary
entropy function:

$$H(x) = -x \log_2 x - (1 - x) \log_2 (1 - x) .$$

Let $A_1, \ldots, A_t \in \{0,1\}^n$. Consider the balls of radius $R$ around $A_1, \ldots, A_t$.
We say that these balls cover $\{0,1\}^n$ if any point in $\{0,1\}^n$ belongs to at least
one of these balls. The centers of the balls that cover $\{0,1\}^n$ are then called a
covering code of length $n$ and radius $R$, see e.g., [2]. The number $t$ of the code
words is called the size of the covering code.
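As a small numerical illustration of these notions, the sketch below computes $V(n, R)$ exactly, compares it with the entropy estimate $2^{H(R/n)n}$, and checks by brute force whether a set of centers is a covering code. All function names are hypothetical, and the brute-force check is only feasible for very small $n$.

```python
from itertools import product
from math import comb, log2


def hamming_distance(a, b):
    """Number of coordinates in which two points of {0,1}^n differ."""
    return sum(x != y for x, y in zip(a, b))


def ball_volume(n, radius):
    """V(n, R): number of points within Hamming distance R of a fixed point."""
    return sum(comb(n, j) for j in range(radius + 1))


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x in (0, 1):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


def is_covering_code(centers, n, radius):
    """True if the balls of the given radius around the centers cover {0,1}^n."""
    return all(
        any(hamming_distance(point, c) <= radius for c in centers)
        for point in product((0, 1), repeat=n)
    )


n, R = 20, 5
volume = ball_volume(n, R)
entropy_bound = 2 ** (binary_entropy(R / n) * n)
```

For example, the two words 000 and 111 form a covering code of length 3 and radius 1, while a single word cannot cover $\{0,1\}^3$ with radius 2.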

Notation. Here is a summary of the notation used in the paper. 

- $F$ denotes a formula; $n$ denotes the number of variables in $F$; $m$ denotes the
number of clauses in $F$; $k$ denotes the maximum length of clauses in $F$;

- $C$ denotes a clause; $|C|$ denotes its length;

- $A$ denotes an assignment;

- $F|_{l=\top}$ denotes the formula obtained from $F$ by assigning $\top$ to literal $l$;

- $R$ denotes the radius of a ball; $V(n, R)$ denotes the volume of a ball of radius
$R$ in $\{0,1\}^n$;

- $H(x)$ denotes the binary entropy function.



3 Randomized Algorithm 

In this section we describe our randomized algorithm for SAT and analyze its
probability of error and running time. The algorithm is called Random-Balls; it
invokes procedures called Ball-Checking and Full-Ball-Checking. We start with
the definition of these procedures. Given a formula F, an assignment A, and a
radius R, each of the procedures searches for a satisfying assignment for F in the
Hamming ball of radius R around A.



Procedure Ball-Checking(F, A, R)

Input: formula F, assignment A, number R.

Output: satisfying assignment or “no”.

1. If all clauses in F are true under A then return A.

2. If $R \le 0$ then return “no”.

3. If F contains an empty clause then return “no”.

4. Choose a shortest clause $l_1 \vee \ldots \vee l_k$ in F that is falsified by A.

5. For $i \leftarrow 1$ to $k$

Invoke Ball-Checking($F_i$, $A_i$, $R - 1$) where $F_i$ is $F|_{l_i=\top}$ and $A_i$ is obtained
from A by flipping the value of $l_i$. If this call returns an assignment S, return
S.

6. Return “no”.

This procedure differs from its counterpart in [3] only in the choice of an
unsatisfied clause at step 4: the procedure above chooses a shortest unsatisfied
clause, while [3] allows choosing any unsatisfied clause.
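The procedure can be sketched in Python as follows. This is an illustration under assumptions, not the authors' implementation: literals are encoded as nonzero integers (+i for $x_i$, -i for $\neg x_i$), clauses as frozensets, assignments as dictionaries, and the answer “no” is represented by None. The empty clause is always falsified and is the shortest possible clause, so step 3 is handled right after the shortest falsified clause is selected.

```python
def literal_value(lit, assignment):
    """Truth value that the assignment gives to a literal."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value


def ball_checking(formula, assignment, radius):
    """Search the Hamming ball of the given radius around the assignment.

    Returns a satisfying assignment, or None in place of the answer "no".
    """
    falsified = [c for c in formula
                 if not any(literal_value(l, assignment) for l in c)]
    if not falsified:                      # step 1: A satisfies F
        return assignment
    if radius <= 0:                        # step 2
        return None
    shortest = min(falsified, key=len)     # step 4
    if not shortest:                       # step 3: F contains an empty clause
        return None
    for lit in sorted(shortest, key=abs):  # step 5
        # F restricted by lit = true, and A with the value of lit flipped
        restricted = [c - {-lit} for c in formula if lit not in c]
        flipped = dict(assignment)
        flipped[abs(lit)] = not flipped[abs(lit)]
        found = ball_checking(restricted, flipped, radius - 1)
        if found is not None:
            return found
    return None                            # step 6
```

Sorting the literals of the chosen clause only makes the order of recursive calls reproducible; any order gives a correct search.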

Lemma 1. If Ball-Checking(F, A, R) returns an assignment then this assignment
satisfies F and belongs to the Hamming ball of radius R around A. If
Ball-Checking(F, A, R) returns “no” then F has no satisfying assignments in
the ball of radius R around A.

Proof. The same as the proof of Lemma 2 in [3]. □

The following lemma gives a natural upper bound on the worst-case running
time of Procedure Ball-Checking.

Lemma 2. The running time of Ball-Checking(F, A, R) is at most
$\mathit{poly}(n)\, m\, k^R$, where $k$ is the maximum length of clauses occurring at steps 4 in
all recursive calls.

Proof. The recursion tree has at most $k^R$ leaves because the maximum degree
of branching is $k$ and the maximum depth is $R$. □

The next procedure Full-Ball-Checking searches for a satisfying assignment in a
ball using a “less intelligent” method: this procedure simply checks the input
formula on all points of the ball.



Procedure Full-Ball-Checking(F, A, R)

Input: formula F over variables $x_1, \ldots, x_n$, assignment A, number R.

Output: satisfying assignment or “no”.

1. For $j \leftarrow 0$ to $R$

For all subsets $\{i_1, \ldots, i_j\} \subseteq \{1, \ldots, n\}$

a) Flip the values of variables $x_{i_1}, \ldots, x_{i_j}$ in A. Let $A'$ be the new
assignment obtained from A by these flips.

b) If $A'$ satisfies F, return $A'$.

2. Return “no”.

Clearly, Full-Ball-Checking runs in time at most $\mathit{poly}(n)\, m\, V(n,R)$.
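A direct sketch of this exhaustive search is given below, under the same illustrative encoding as before (integer literals, frozenset clauses, dictionary assignments, None for “no”); the names are hypothetical. The two nested loops visit each of the $V(n, R)$ points of the ball exactly once.

```python
from itertools import combinations


def satisfies(assignment, formula):
    """True if every clause has a literal made true by the assignment."""
    return all(
        any(assignment[abs(l)] == (l > 0) for l in clause)
        for clause in formula
    )


def full_ball_checking(formula, assignment, radius):
    """Check the formula on every point of the Hamming ball around the assignment."""
    variables = sorted(assignment)
    for j in range(radius + 1):                      # step 1
        for subset in combinations(variables, j):
            candidate = dict(assignment)
            for v in subset:                         # step 1a: flip j variables
                candidate[v] = not candidate[v]
            if satisfies(candidate, formula):        # step 1b
                return candidate
    return None                                      # step 2
```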

Next we define Algorithm Random-Balls. Given a formula F, this algorithm
either returns a satisfying assignment for F or replies that F is unsatisfiable.
In addition to F, the algorithm takes two numbers as input: R (radius of balls)
and l (“threshold length” of clauses). The algorithm generates a certain number
of random assignments step by step. For each such assignment A, the algorithm
searches for a satisfying assignment in the ball of radius R around A.
To do it, the algorithm invokes either Procedure Ball-Checking or Procedure
Full-Ball-Checking. The first one is executed if all clauses that would occur at
its steps 4 are shorter than the specified “threshold” l. Otherwise, the algorithm
invokes Procedure Full-Ball-Checking.

Algorithm Random-Balls(F, R, l)

Input: formula F over n variables, numbers R and l such that 0 < R < l < n.
Output: satisfying assignment or “no”.

1. $N = \left\lceil \sqrt{8R(1 - R/n)} \cdot 2^{(1 - H(R/n))n} \right\rceil$.

2. Repeat N times the following:

a) Choose an assignment A uniformly at random.

b) If F contains a clause that has at least l literals falsified by A and at most
R literals satisfied by A, invoke Full-Ball-Checking(F, A, R). Otherwise
invoke Ball-Checking(F, A, R). If the invoked procedure finds a satisfying
assignment, return it.

3. Return “no”.
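The following self-contained sketch puts the three pieces together. It is an illustration under assumptions rather than the authors' code: literals are nonzero integers, clauses are frozensets, None stands for “no”, the number of trials is taken to be $\lceil \sqrt{8R(1-R/n)} \cdot 2^{(1-H(R/n))n} \rceil$ (the reciprocal of the lower bound on the probability of hitting a fixed ball), and the optional `rng` parameter is a hypothetical convenience for reproducible runs.

```python
import random
from itertools import combinations
from math import ceil, log2, sqrt


def lit_true(lit, a):
    return a[abs(lit)] == (lit > 0)


def satisfies(a, formula):
    return all(any(lit_true(l, a) for l in c) for c in formula)


def ball_checking(formula, a, radius):
    falsified = [c for c in formula if not any(lit_true(l, a) for l in c)]
    if not falsified:
        return a
    shortest = min(falsified, key=len)
    if radius <= 0 or not shortest:
        return None
    for lit in shortest:
        restricted = [c - {-lit} for c in formula if lit not in c]
        # lit is false under a, so flipping its variable makes it true
        found = ball_checking(restricted, {**a, abs(lit): lit > 0}, radius - 1)
        if found is not None:
            return found
    return None


def full_ball_checking(formula, a, radius):
    for j in range(radius + 1):
        for subset in combinations(sorted(a), j):
            candidate = {v: (not val if v in subset else val)
                         for v, val in a.items()}
            if satisfies(candidate, formula):
                return candidate
    return None


def entropy(x):
    return 0.0 if x in (0, 1) else -x * log2(x) - (1 - x) * log2(1 - x)


def random_balls(formula, n, radius, threshold, rng=random):
    trials = ceil(sqrt(8 * radius * (1 - radius / n))
                  * 2 ** ((1 - entropy(radius / n)) * n))      # step 1
    for _ in range(trials):                                     # step 2
        a = {v: rng.random() < 0.5 for v in range(1, n + 1)}    # step 2a
        long_clause = any(
            sum(not lit_true(l, a) for l in c) >= threshold
            and sum(lit_true(l, a) for l in c) <= radius
            for c in formula
        )                                                       # step 2b
        search = full_ball_checking if long_clause else ball_checking
        found = search(formula, a, radius)
        if found is not None:
            return found
    return None                                                 # step 3
```

On an unsatisfiable formula the sketch always answers None; on a satisfiable one it may, with bounded probability, miss every ball that contains a satisfying assignment.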

Obviously, if the algorithm Random-Balls returns an assignment S then S
satisfies the input formula, but the answer “no” may be incorrect. Thus, the
algorithm is a one-sided error Monte Carlo algorithm that makes no mistake on
unsatisfiable formulas, but may err on satisfiable ones. The following lemma
estimates its probability of error.

Lemma 3. For any R and l, the following holds:

1. If an input formula F is unsatisfiable then Algorithm Random-Balls returns
“no” with probability 1.

2. If F is satisfiable then Algorithm Random-Balls finds a satisfying assignment
with probability at least 1/2.

Proof. The first part follows from Lemma 1. Consider the second part: F has a
satisfying assignment S, but all N trials of the algorithm return “no”. This is
possible only if for each of the N random assignments (chosen at step 2a), its
Hamming distance from S is greater than R. Therefore, the probability of error
does not exceed $(1 - p)^N$ where $p$ is the probability that a random assignment
belongs to the Hamming ball of radius R around S. To estimate $p$, we observe
that $p = V(n, R)/2^n$ where $V(n, R)$ is the volume of a Hamming ball of radius R
in the Boolean cube $\{0,1\}^n$. For $R < n/2$, the volume $V(n, R)$ can be estimated
as follows, see e.g. [2, Lemma 2.4.4]:

$$\frac{2^{H(R/n)n}}{\sqrt{8R(1 - R/n)}} \;\le\; V(n, R) \;\le\; 2^{H(R/n)n} .$$

Therefore $p \ge 2^{(H(R/n)-1)n} / \sqrt{8R(1 - R/n)}$. Using this lower bound on $p$, we get
the stated upper bound on the probability of error: $(1 - p)^N \le e^{-pN} < 1/2$. □
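The last step can be spelled out as follows, assuming that the number of trials satisfies $N \ge \sqrt{8R(1-R/n)} \cdot 2^{(1-H(R/n))n}$, i.e., that $N$ is at least the reciprocal of the lower bound on $p$:

```latex
\[
pN \;\ge\; \frac{2^{(H(R/n)-1)n}}{\sqrt{8R(1-R/n)}}
      \cdot \sqrt{8R(1-R/n)}\; 2^{(1-H(R/n))n} \;=\; 1,
\qquad\text{hence}\qquad
(1-p)^N \;\le\; e^{-pN} \;\le\; e^{-1} \;<\; \tfrac12 .
\]
```

The first inequality on the right uses $1 - p \le e^{-p}$, which holds for every real $p$.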

The following lemma is needed to estimate the running time of the algorithm
Random-Balls.

Lemma 4. Consider the execution of Random-Balls(F, R, l) that invokes Procedure
Ball-Checking. For any input R and l, the maximum length of clauses
chosen at steps 4 of Procedure Ball-Checking is less than l.

Proof. The proof follows from the condition of step 2(b) of Random-Balls. More
formally, let C be a clause of length at least l occurring in step 4 in some recursive
call of Ball-Checking(F, A, R). Then C is a “descendant” of some clause D in F,
i.e., C is obtained from D by removing $|D| - |C|$ literals where $|D| - |C| \le R$.
The removed $|D| - |C|$ literals must be true under the initial assignment A; the
remaining $|C|$ literals must be false under it. Hence D has at least l literals
falsified by A and at most R literals satisfied by A, so step 2(b) would have
invoked Full-Ball-Checking instead of Ball-Checking. □



Lemma 5. For any input R and l, let $p$ be the probability (taken over random
assignment A) that Random-Balls invokes Procedure Full-Ball-Checking at step 2(b).
Then we have the following bound on $p$:

$$p \le m\, 2^{l\left(H\left(\frac{R}{R+l}\right) - 1\right)} .$$

Proof. We estimate the probability that a clause D in formula F meets the
condition of step 2(b). If this condition holds, at least $\max(l, |D| - R)$ literals
must be false under A. There are

$$V(|D|, \min(|D| - l, R))$$

such assignments to the variables of D. Since $\min(|D| - l, R) \le |D|/2$, this volume
is at most $2^{H(\min(|D| - l, R)/|D|)\,|D|}$. If $|D| - l \le R$, the exponent transforms to

$$H(1 - l/|D|)\,|D| \;\le\; H(1 - l/(l + R))\,|D| \;=\; H(R/(R + l))\,|D| .$$

Otherwise, the exponent transforms just to $H(R/|D|)\,|D| \le H(R/(R + l))\,|D|$.
Therefore, there are at most $2^{H(R/(R+l))\,|D|}$ such assignments to the variables of
D and at most

$$2^{H\left(\frac{R}{R+l}\right)|D| + n - |D|} \;=\; 2^{\left(H\left(\frac{R}{R+l}\right) - 1\right)|D| + n} \;\le\; 2^{l\left(H\left(\frac{R}{R+l}\right) - 1\right) + n}$$

assignments to the variables of F. Multiplying this bound by the number of
clauses in F and dividing by the total number $2^n$ of assignments, we get the
claim. □



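Theorem 1. For $R = 0.339\sqrt{n}$ and $l = 1.87\sqrt{n}$, the expected running time of
Random-Balls(F, R, l) is at most

$$\mathit{poly}(n)\, m^2\, 2^{n - 0.712\sqrt{n}} .$$

Proof. We need to estimate $N \cdot T$, where N is the number of random balls used
by the algorithm Random-Balls and T is the expected running time of search inside
a ball (i.e., of either Ball-Checking(F, A, R) or Full-Ball-Checking(F, A, R)).
Using Lemma 5 and the upper bound on $V(n,R)$, we get the following upper
bound on T:

$$\begin{aligned}
T &\le \mathit{poly}(n)\, m \left(p\, V(n,R) + (1 - p)\, l^R\right) \\
  &\le \mathit{poly}(n)\, m \left(p\, V(n,R) + l^R\right) \\
  &\le \mathit{poly}(n)\, m \left(m\, 2^{l\left(H\left(\frac{R}{R+l}\right) - 1\right) + H\left(\frac{R}{n}\right)n} + l^R\right) .
\end{aligned}$$

Hence we have

$$\begin{aligned}
N \cdot T &\le m \cdot \mathit{poly}(n) \cdot 2^{\left(1 - H\left(\frac{R}{n}\right)\right)n} \left(m\, 2^{l\left(H\left(\frac{R}{R+l}\right) - 1\right) + H\left(\frac{R}{n}\right)n} + l^R\right) \\
  &= m \cdot \mathit{poly}(n) \cdot \left(m\, 2^{n + l\left(H\left(\frac{R}{R+l}\right) - 1\right)} + 2^{n - H\left(\frac{R}{n}\right)n + R\log_2 l}\right) \\
  &\le m^2 \cdot \mathit{poly}(n) \cdot 2^n \cdot \left(2^{-\phi} + 2^{-\psi}\right)
\end{aligned}$$

where

$$\phi = l\left(1 - H\!\left(\frac{R}{R+l}\right)\right)
\qquad\text{and}\qquad
\psi = H\!\left(\frac{R}{n}\right) n - R \log_2 l .$$

Thus, we need to minimize $2^{-\phi} + 2^{-\psi}$. Let us estimate $\phi$ and $\psi$ taking $R = a\sqrt{n}$
and $l = b\sqrt{n}$ where $a < b$. In the estimation we use the fact that $\ln(1 + x) = x + o(x)$
for small $x$:

$$\begin{aligned}
\phi &= b\sqrt{n}\left(1 - H\!\left(\frac{a}{a+b}\right)\right) \\
     &= b\sqrt{n}\left(1 - \frac{a}{a+b}\log_2\frac{a+b}{a} - \frac{b}{a+b}\log_2\frac{a+b}{b}\right) \\
     &= b\sqrt{n}\left(1 - \log_2(a+b) + \frac{a\log_2 a + b\log_2 b}{a+b}\right),
\end{aligned}$$

$$\begin{aligned}
\psi &= H\!\left(\frac{a}{\sqrt{n}}\right) n - a\sqrt{n}\log_2(b\sqrt{n}) \\
     &= \left(\frac{a}{\sqrt{n}}\log_2\frac{\sqrt{n}}{a} + \frac{\sqrt{n} - a}{\sqrt{n}}\log_2\frac{\sqrt{n}}{\sqrt{n} - a}\right) n - a\sqrt{n}\log_2(b\sqrt{n}) \\
     &= a\sqrt{n}\log_2\frac{\sqrt{n}}{a} + \sqrt{n}(\sqrt{n} - a)(\log_2 e)\ln\!\left(1 + \frac{a}{\sqrt{n} - a}\right) - a\sqrt{n}\log_2(b\sqrt{n}) \\
     &= a\sqrt{n}\log_2\frac{\sqrt{n}}{a} + a\sqrt{n}\log_2 e - a\sqrt{n}\log_2(b\sqrt{n}) + o(\sqrt{n}) \\
     &= a\sqrt{n}\log_2\frac{e}{ab} + o(\sqrt{n}) .
\end{aligned}$$

Taking $a = 0.339$ and $b = 1.87$, we get $\phi, \psi > 0.712\sqrt{n}$, which gives us the stated
overall upper bound. □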
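The constants can be checked numerically. The sketch below evaluates only the leading coefficients of $\sqrt{n}$ in the two exponents, namely $b\,(1 - H(a/(a+b)))$ for $\phi$ and $a \log_2(e/(ab))$ for $\psi$, and ignores the $o(\sqrt{n})$ terms; the function names are hypothetical.

```python
from math import e, log2


def binary_entropy(x):
    return -x * log2(x) - (1 - x) * log2(1 - x)


def phi_coefficient(a, b):
    """Leading coefficient of phi / sqrt(n): b * (1 - H(a / (a + b)))."""
    return b * (1 - binary_entropy(a / (a + b)))


def psi_coefficient(a, b):
    """Leading coefficient of psi / sqrt(n): a * log2(e / (a * b))."""
    return a * log2(e / (a * b))


a, b = 0.339, 1.87
print(round(phi_coefficient(a, b), 4), round(psi_coefficient(a, b), 4))
```

Both coefficients come out at about 0.71, with the one for $\psi$ being the smaller of the two, so the choice of $a$ and $b$ roughly balances the two terms $2^{-\phi}$ and $2^{-\psi}$.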



4 Derandomization 

In this section we describe the derandomization of our algorithm. The only
part of the randomized algorithm where random bits are used is the choice of
initial assignments. Our deterministic algorithm (Algorithm Deterministic-Balls
described below) chooses initial assignments from a covering code (see Sect. 2).
Such a code can be, for example, constructed by a greedy algorithm, as formulated
in the following lemma.

Lemma 6 ([3]). Let $d \ge 2$ be a divisor of $n \ge 1$, and $0 < R < n/2$.
Then there is a polynomial $q_d(n)$ such that a covering code of length $n$, radius
at most $R$, and size at most $q_d(n) \cdot 2^{(1 - H(R/n))n}$ can be constructed in time

$$q_d(n)\left(2^{3n/d} + 2^{(1 - H(R/n))n}\right) .$$
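The greedy idea behind such a construction can be sketched as follows. This is only an illustration of the greedy Set Cover heuristic and of the direct product of codes on tiny instances; it does not reproduce the construction of [3] or its size and time guarantees, and the function names are hypothetical. A direct product of $c$ copies of a code of length $n'$ and radius $R'$ is a covering code of length $cn'$ and radius $cR'$.

```python
from itertools import product


def greedy_covering_code(n, radius):
    """Greedy Set Cover heuristic: repeatedly pick the center whose ball
    covers the largest number of still-uncovered points of {0,1}^n."""
    points = list(product((0, 1), repeat=n))
    balls = {
        c: {p for p in points if sum(x != y for x, y in zip(c, p)) <= radius}
        for c in points
    }
    uncovered = set(points)
    code = []
    while uncovered:
        best = max(points, key=lambda c: len(balls[c] & uncovered))
        code.append(best)
        uncovered -= balls[best]
    return code


def direct_product(code, copies):
    """All concatenations of `copies` code words (as tuples)."""
    return [sum(words, ()) for words in product(code, repeat=copies)]


small_code = greedy_covering_code(3, 1)
product_code = direct_product(small_code, 2)
```

The sketch stores every ball explicitly, so it uses space exponential in $n$ and is meant for small blocks only.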



Algorithm Deterministic-Balls(F, R, l)

Input: formula F over n variables, numbers R and l such that 0 < R < l < n.
Output: satisfying assignment or “no”.

1. Let $\mathcal{C}$ be a covering code of length n and radius R constructed in Lemma 6.
For each assignment $A \in \mathcal{C}$ do the following:

If F contains a clause that has at least l literals falsified by A and at most
R literals satisfied by A, invoke Full-Ball-Checking(F, A, R). Otherwise
invoke Ball-Checking(F, A, R). If the invoked procedure finds a satisfying
assignment, return it.

2. Return “no”.



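Theorem 2. Taking $R = \frac{2}{\log_2 e}\sqrt{\frac{n}{\log_2 n}}$ and $l = \frac{\log_2 e}{2}\sqrt{n \log_2 n}$, Algorithm
Deterministic-Balls runs on F, R, and l in time at most

$$\mathit{poly}(n)\, m^2\, 2^{n - 2\sqrt{n/\log_2 n}} .$$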

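Proof. For each ball, the algorithm invokes one of the two procedures: either
Full-Ball-Checking or Ball-Checking. Let $b_1$ be the number of balls for which
Full-Ball-Checking is called and $b_2$ be the number of balls where Ball-Checking
is called. Lemma 5 gives the upper bound on $b_1$:

$$b_1 \le p\, 2^n = m\, 2^{l\left(H\left(\frac{R}{R+l}\right) - 1\right) + n} .$$

In each of these $b_1$ balls, the algorithm examines at most $V(n, R)$ assignments.
The number $b_2$ is obviously not greater than the size of $\mathcal{C}$:

$$b_2 \le \mathit{poly}(n)\, 2^n / V(n, R) .$$

In each of these $b_2$ balls, the algorithm examines at most $l^R$ assignments.
Therefore, the total number of examined assignments can be estimated as follows:

$$\begin{aligned}
b_1 \cdot V(n, R) + b_2 \cdot l^R
  &\le m\, 2^{l\left(H\left(\frac{R}{R+l}\right) - 1\right) + n + H\left(\frac{R}{n}\right)n} + \mathit{poly}(n)\, 2^{n + R\log_2 l - H\left(\frac{R}{n}\right)n} \\
  &= m\, 2^{S_1} + \mathit{poly}(n)\, 2^{S_2} .
\end{aligned}$$

We now estimate the exponents $S_1$ and $S_2$ taking $R = \sqrt{n}/\Delta$ and $l = \Delta\sqrt{n}$
where $\Delta$ is a function of $n$ such that $\Delta > \sqrt{2}$ for sufficiently large $n$. Due to this
condition on $\Delta$, we have $l > 2R$ and therefore $H(R/(R + l)) < H(R/l)$. We get

$$\begin{aligned}
S_1 &= l\left(H\!\left(\frac{R}{R+l}\right) - 1\right) + n + H\!\left(\frac{R}{n}\right) n \\
    &< l\left(H\!\left(\frac{R}{l}\right) - 1\right) + n + H\!\left(\frac{R}{n}\right) n \\
    &= \Delta\sqrt{n}\left(H\!\left(\frac{1}{\Delta^2}\right) - 1\right) + n + H\!\left(\frac{1}{\Delta\sqrt{n}}\right) n \\
    &= n - \sqrt{n}\,\Bigl(-\frac{2}{\Delta}\log_2\Delta + \Delta + \Bigl(\Delta - \frac{1}{\Delta}\Bigr)\log_2\Bigl(1 - \frac{1}{\Delta^2}\Bigr) \\
    &\qquad\qquad - \frac{1}{\Delta}\log_2(\Delta\sqrt{n}) + \Bigl(\sqrt{n} - \frac{1}{\Delta}\Bigr)\log_2\Bigl(1 - \frac{1}{\Delta\sqrt{n}}\Bigr)\Bigr) \\
    &= n - \sqrt{n}\,\Bigl(-\frac{2}{\Delta}\log_2\Delta + \Delta - \Delta\Bigl(1 - \frac{1}{\Delta^2}\Bigr)\Bigl(\frac{1}{\Delta^2} + O\Bigl(\frac{1}{\Delta^4}\Bigr)\Bigr)\log_2 e \\
    &\qquad\qquad - \frac{1}{\Delta}\log_2\Delta - \frac{1}{2\Delta}\log_2 n - \Bigl(\sqrt{n} - \frac{1}{\Delta}\Bigr)\Bigl(\frac{1}{\Delta\sqrt{n}} + O\Bigl(\frac{1}{\Delta^2 n}\Bigr)\Bigr)\log_2 e\Bigr) \\
    &= n - \sqrt{n}\,\Bigl(-\frac{3}{\Delta}\log_2\Delta + \Delta - \frac{1}{2\Delta}\log_2 n + O\Bigl(\frac{1}{\Delta}\Bigr)\Bigr) .
\end{aligned}$$

Substituting $\Delta = \delta\sqrt{\log_2 n}$, we get

$$\begin{aligned}
S_1 &< n - \sqrt{n}\left(\delta\sqrt{\log_2 n} - \frac{\sqrt{\log_2 n}}{2\delta} + o(1)\right) \\
    &= n - \sqrt{n\log_2 n}\left(\delta - \frac{1}{2\delta} + o(1)\right) .
\end{aligned}$$

For $\delta > 1/\sqrt{2}$, we have $S_1 < n - c\sqrt{n\log_2 n}$, where $c$ is a positive constant.
We now estimate $S_2$ as follows:

$$\begin{aligned}
S_2 &= n + R\log_2 l - H\!\left(\frac{R}{n}\right) n \\
    &= n + \frac{\sqrt{n}}{\Delta}\log_2(\Delta\sqrt{n}) - \frac{\sqrt{n}}{\Delta}\log_2(\Delta\sqrt{n}) - \sqrt{n}\,\frac{\Delta\sqrt{n} - 1}{\Delta}\log_2\frac{\Delta\sqrt{n}}{\Delta\sqrt{n} - 1} \\
    &= n - \sqrt{n}\,\frac{\Delta\sqrt{n} - 1}{\Delta}\,(\log_2 e)\ln\!\left(1 + \frac{1}{\Delta\sqrt{n} - 1}\right) \\
    &= n - \frac{\sqrt{n}\log_2 e}{\Delta} - \sqrt{n}\cdot o\!\left(\frac{1}{\Delta}\right) .
\end{aligned}$$

Taking $\Delta = \delta\sqrt{\log_2 n}$, we have $S_2 < n - \frac{\log_2 e}{\delta}\sqrt{n/\log_2 n}$. Since $S_2$ dominates
$S_1$, the total number of examined assignments is at most

$$m\, 2^{S_1} + \mathit{poly}(n)\, 2^{S_2} \;\le\; \mathit{poly}(n)\, m\, 2^{n - \frac{\log_2 e}{\delta}\sqrt{\frac{n}{\log_2 n}}}$$

where $\delta > 1/\sqrt{2}$. If we take $\delta = (1/2)\log_2 e > 1/\sqrt{2}$, we get the claim. □

Remark 1. In the proof of Theorem 2 one could take $\delta$ arbitrarily close to $1/\sqrt{2}$
getting the bound $\mathit{poly}(n)\, m^2\, 2^{n - (\sqrt{2}\log_2 e - \varepsilon)\sqrt{n/\log_2 n}}$
for any $\varepsilon > 0$. To improve the bound even more, one could construct a code
with proportion $O(p)$ of balls where Full-Ball-Checking is invoked. Such a code
exists; however, we leave constructing it as an open question.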
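For orientation, the parameter choices discussed in the proof of Theorem 2 and in Remark 1 can be written out explicitly. The numerical values below are rounded, and the notation $\Delta = \delta\sqrt{\log_2 n}$, $R = \sqrt{n}/\Delta$, $l = \Delta\sqrt{n}$ is the one used in the proof:

```latex
\[
\delta = \tfrac12 \log_2 e \approx 0.7213 \;>\; \tfrac{1}{\sqrt2} \approx 0.7071,
\qquad
\frac{\log_2 e}{\delta} = 2,
\]
\[
R = \frac{\sqrt n}{\delta\sqrt{\log_2 n}} = \frac{2}{\log_2 e}\sqrt{\frac{n}{\log_2 n}},
\qquad
l = \delta\sqrt{\log_2 n}\,\sqrt n = \frac{\log_2 e}{2}\sqrt{n\log_2 n},
\]
\[
\delta \to \tfrac{1}{\sqrt2}
\quad\Longrightarrow\quad
\frac{\log_2 e}{\delta} \to \sqrt2\,\log_2 e \approx 2.040 .
\]
```

So moving $\delta$ towards $1/\sqrt{2}$ improves the constant in the exponent only from 2 to roughly 2.04.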
Abstract. In this paper we study the complexity of the maximum constraint
satisfaction problem (Max CSP) over an arbitrary finite domain.
We describe a novel connection between this problem and the supermodular
function maximization problem (which is dual to the submodular
function minimization problem). Using this connection, we are able to
identify large classes of efficiently solvable subproblems of Max CSP arising
from certain restrictions on the constraint types. Until now, the only
known polynomial-time solvable cases for this form of optimization problem
were restricted to constraints over a 2-valued (Boolean) domain. Here
we obtain the first examples of general families of efficiently solvable cases
of Max CSP for arbitrary finite domains, by considering supermodular
functions on finite lattices. Finally, we show that the equality constraint
over a non-Boolean domain is non-supermodular, and, when combined
with some simple unary constraints, gives rise to cases of Max CSP which
are hard even to approximate.



1 Introduction 

The main object of our study in this paper is the maximum constraint satis- 
faction problem (Max CSP) where one is given a collection of constraints on 
overlapping sets of variables and the goal is to find an assignment of values to the 
variables that maximizes the number of satisfied constraints. A number of classic 
optimization problems including Max k-SAT, Max Cut and Max Dicut can
be represented in this framework, and it can also be used to model optimization 
problems arising in more applied settings, such as database design [9]. 

The Max CSP framework has been well-studied in the Boolean case, that is,
when the set of values for the variables is {0, 1}. Many fundamental results have
been obtained, containing both complexity classifications and approximation
properties (see, e.g., [7,17,20]). In the non-Boolean case, a number of results
have been obtained that concern approximation properties (see, e.g., [9,11]). 
However, there has so far been very little study of efficient exact algorithms and 
complexity for subproblems of non-Boolean Max CSP, and the present paper 
is aimed at filling this gap. 

We study a standard parameterized version of the Max CSP, in which re- 
strictions may be imposed on the types of constraints allowed in the instances. 
In particular, we investigate which restrictions make such problems tractable, by 
allowing a polynomial time algorithm to find an optimal assignment. This setting 
has been extensively studied and completely classified in the Boolean case [7, 
20]. In contrast, we consider here the case where the set of possible values is an 
arbitrary finite set. 

Experience in the study of various forms of constraint satisfaction [2,3,4, 
5,19] has shown that the more general form of such problems, in which the 
domain is an arbitrary finite set, is often considerably more difficult to analyze 
than the Boolean case. The techniques developed for the Boolean case typically 
involve the careful manipulation of logical formulas [7]; such techniques do not 
readily extend to larger domains. For example, Schaefer [25] obtained a complete 
classification of complexity for the standard constraint satisfaction problem in 
the Boolean case using such techniques in 1978; although he raised the question 
of generalizing this result to larger domains in the same paper, little progress 
was made for the next twenty years. 

The key step in the analysis of the standard constraint satisfaction prob- 
lem [3,4] was the discovery that the characterization of the tractable cases over 
the Boolean domain can be restated in an algebraic form [19]. This algebraic de- 
scription of the characterization has also proved to be a key step in the analysis 
of the counting constraint satisfaction problem [5] and the quantified constraint 
satisfaction problem [2]. However, this form of algebraic description does not 
provide a suitable tool for analyzing the Max CSP, which is our focus here. 

The main contribution of this paper is the first general approach to and the 
first general results about the complexity of subproblems of non-Boolean Max 
CSP. We point out that the characterization of the tractable cases of Max CSP 
over a Boolean domain can also be restated in an algebraic form, but using a 
rather different algebraic framework: we show that they can be characterized 
using the property of supermodularity. We also show how this property can be 
generalized to the non-Boolean case, and hence used to identify large families of 
tractable subproblems of the non-Boolean Max CSP. Moreover, we give some 
results to demonstrate how non-supermodularity can cause hardness of the cor- 
responding subproblem. 

The properties of sub- and supermodularity have been extensively used to 
study combinatorial optimization problems in other contexts. In particular, the 
problem of minimizing a submodular set function has been thoroughly studied, 
due to its applications across a variety of research areas [13,16,18,21,22]. The 
dual problem of maximizing a supermodular function has found interesting applications
in diverse economic models, such as supermodular games (see [28]).
Submodular functions defined on (products of) totally ordered sets correspond
precisely to Monge matrices and arrays (see, for example, survey [6]) which play 
an important role in solving a number of optimization problems including trav- 
elling salesman, assignment and transportation problems [6]. Hence this paper 
also unifies, for the first time, the study of the Max CSP with many other areas 
of combinatorial optimization. 

The structure of the paper is as follows. In Section 2 we discuss the Max 
CSP problem, its Boolean case, its complexity, and the relevance of sub- and 
supermodularity. In Sections 3 and 4, we give two different generalizations for 
the (unique) non-trivial tractable case of Boolean Max CSP: one to general 
supermodular constraints on restricted types of ordered domains (distributive 
lattices), and the other to a restricted form of supermodular constraint on more 
general ordered domains (arbitrary lattices). For the second case, we are able 
to give a cubic time algorithm, based on a reduction to the Min Cut prob- 
lem. Section 5 describes an even more efficient algorithm for all binary super- 
modular constraints on a totally ordered domain and then shows that the only 
tractability-preserving way of extending this set of constraints is with further
supermodular functions; all other extensions give rise to hard problems. As fur- 
ther evidence that non-supermodularity causes hardness of Max CSP, Section 
6 establishes the surprising result that, in the non-Boolean case, allowing just 
the (non-supermodular) equality constraint and unary constraints gives rise to 
versions of Max CSP that are hard even to approximate. Finally, in Section 7 
we discuss our ideas in light of the results obtained, and describe possible future 
work. Proofs of all results are omitted due to space constraints. 

2 Preliminaries 

Throughout the paper, let $D$ denote a finite set, $|D| > 1$. Let $R_D^{(m)}$ denote the
set of all $m$-ary predicates over $D$, that is, functions from $D^m$ to $\{0,1\}$, and let
$R_D = \bigcup_{m=1}^{\infty} R_D^{(m)}$.

Definition 1. A constraint over a set of variables $V = \{x_1, x_2, \ldots, x_n\}$ is an
expression of the form $f(\mathbf{x})$ where

- $f \in R_D^{(m)}$ is called the constraint function;

- $\mathbf{x} = (x_{i_1}, \ldots, x_{i_m})$ is called the constraint scope.

The constraint $f$ is said to be satisfied on a tuple $\mathbf{a} = (a_{i_1}, \ldots, a_{i_m}) \in D^m$ if
$f(\mathbf{a}) = 1$.

Definition 2. An instance of Max CSP is a collection of constraints
$\{f_1(\mathbf{x}_1), \ldots, f_q(\mathbf{x}_q)\}$, $q \ge 1$, over a set of variables $V = \{x_1, \ldots, x_n\}$, where
$f_i \in R_D$ for all $1 \le i \le q$. The goal is to find an assignment $\phi : V \to D$ that
maximizes the number of satisfied constraints.
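To make Definition 2 concrete, here is a minimal brute-force sketch. The representation is an assumption made for illustration: variables are indexed from 0, a constraint is a pair (predicate, scope), and a predicate maps a tuple of domain values to 0 or 1. The example encodes Max Cut on a triangle, one of the problems mentioned in the introduction, as a Max CSP instance over $D = \{0, 1\}$. Exhaustive search takes $|D|^n$ steps and is not one of the efficient algorithms studied in the paper.

```python
from itertools import product


def max_csp(domain, num_vars, constraints):
    """Brute-force Max CSP.

    `constraints` is a list of (predicate, scope) pairs, where scope is a
    tuple of variable indices and predicate maps a tuple of values to 0 or 1.
    Returns (best assignment, number of satisfied constraints).
    """
    best, best_score = None, -1
    for assignment in product(domain, repeat=num_vars):
        score = sum(f(tuple(assignment[i] for i in scope))
                    for f, scope in constraints)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score


# Max Cut on a triangle: one "not equal" constraint per edge.
neq = lambda t: int(t[0] != t[1])
triangle = [(neq, (0, 1)), (neq, (1, 2)), (neq, (0, 2))]
assignment, score = max_csp((0, 1), 3, triangle)
```

At most two of the three edges of a triangle can be cut, so no assignment satisfies all three constraints.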

Arguably, it is more appropriate for our purposes to consider the 0, 1 values
taken by constraint functions as integers and not as Boolean values; the goal in
a Max CSP instance is then to maximize the function $f : D^n \to \mathbb{Z}_+$ (where $\mathbb{Z}_+$
is the set of all non-negative integers), defined by

$$f(x_1, \ldots, x_n) = \sum_{i=1}^{q} f_i(\mathbf{x}_i) .$$

The weighted version of the Max CSP problem, in which each constraint $f_i(\mathbf{x}_i)$
has associated weight $\rho_i \in \mathbb{Z}_+$, can be viewed as the problem of maximizing the
function

$$f(x_1, \ldots, x_n) = \sum_{i=1}^{q} \rho_i \cdot f_i(\mathbf{x}_i) .$$

In fact, the two versions of Max CSP can be shown to be equivalent (as in [7,
Lemma 7.2]).

Throughout the paper, $\mathcal{F}$ will denote a finite subset of $R_D$, and
Max CSP($\mathcal{F}$) will denote the restriction of Max CSP to instances where all
constraint functions belong to $\mathcal{F}$. The central problem we consider in this paper
is the following.

Problem 1. Identify efficiently solvable problems of the form Max CSP($\mathcal{F}$).

Recall that PO and NPO are optimization analogs of P and NP; that is, 
they are classes of optimization problems that can be solved in deterministic 
polynomial time and non-deterministic polynomial time, respectively. We will 
call problems in PO tractable. An optimization problem is called NP-hard if it 
admits a polynomial time Turing reduction from some NP-complete problem. 
The approximation complexity class APX consists of all NPO problems for 
which there is a polynomial time approximation algorithm whose performance 
ratio is bounded by a constant. A problem in APX is called APX-complete if 
every problem in APX has a special approximation-preserving reduction, called 
an AP-reduction, to it. It is well-known that every APX-complete problem 
is NP-hard. For more detailed definitions of approximation and optimization 
complexity classes and reductions, the reader is referred to [1,7,23].

Proposition 1. Max CSP($\mathcal{F}$) belongs to APX for every $\mathcal{F}$.

A complete classification of the complexity of Max CSP($\mathcal{F}$) for a two-element
set $D$ can be found in [7]. Before stating that result we need to give
some definitions.

Definition 3. An endomorphism of $\mathcal{F}$ is a unary operation $\pi$ on $D$ such that
$f(a_1, \ldots, a_m) = 1 \Rightarrow f(\pi(a_1), \ldots, \pi(a_m)) = 1$ for all $(a_1, \ldots, a_m) \in D^m$ and
all $f \in \mathcal{F}$. We will say that $\mathcal{F}$ is a core if every endomorphism of $\mathcal{F}$ is injective
(i.e. a permutation).

The intuition here is that if $\mathcal{F}$ is not a core then it has a non-injective endomorphism
$\pi$, which implies that, for every assignment $\phi$ there is another assignment
$\pi\phi$ that satisfies all constraints satisfied by $\phi$ and uses only a restricted set of
values, so the problem can be reduced to a problem over this smaller set. For
example, if $D = \{0, 1\}$ then $\mathcal{F}$ is not a core if and only if $f(a, \ldots, a) = 1$ for
some $a \in D$ and all $f \in \mathcal{F}$. Obviously, in this case the assignment that assigns
the value $a$ to all variables satisfies all constraints, so it is optimal, and hence
Max CSP($\mathcal{F}$) is trivial.

Definition 4 ([7]). A function $f \in R_{\{0,1\}}^{(n)}$ is called 2-monotone if it can be
expressed as follows:

$$f(x_1, \ldots, x_n) = 1 \iff (x_{i_1} \wedge \ldots \wedge x_{i_s}) \vee (\neg x_{j_1} \wedge \ldots \wedge \neg x_{j_t}),$$

where either of the two disjuncts may be empty (i.e., the values of $s$ or $t$ may be
zero).

Theorem 1 ([7]). Let $\mathcal{F} \subseteq R_{\{0,1\}}$ be a core. If every $f \in \mathcal{F}$ is 2-monotone,
then (weighted) Max CSP($\mathcal{F}$) is in PO, otherwise it is APX-complete.

As we announced in the introduction, the main new tools which we introduce to
generalize (the tractability part of) this result will be the conditions of sub- and
supermodularity. We will consider the most general type of sub- and supermodular
functions, that is, those defined on a (general) lattice, as in [28]. Recall that
a partial order $\sqsubseteq$ on a set $D$ is called a lattice order if, for every $x, y \in D$, there
exist a greatest lower bound $x \sqcap y$ and a least upper bound $x \sqcup y$. The algebra
$\mathcal{L} = (D, \sqcap, \sqcup)$ on $D$ with two binary operations $\sqcap$ and $\sqcup$ is called a lattice, and
we have $x \sqsubseteq y \iff x \sqcap y = x \iff x \sqcup y = y$. As is well known, every finite lattice
has a least element and a greatest element, which we will denote by $0_{\mathcal{L}}$ and $1_{\mathcal{L}}$,
respectively. (For more information about lattices, see, e.g., [10].)

For tuples $\mathbf{a} = (a_1, \ldots, a_n)$, $\mathbf{b} = (b_1, \ldots, b_n)$ in $D^n$, let $\mathbf{a} \sqcap \mathbf{b}$ and $\mathbf{a} \sqcup \mathbf{b}$
denote the tuples $(a_1 \sqcap b_1, \ldots, a_n \sqcap b_n)$ and $(a_1 \sqcup b_1, \ldots, a_n \sqcup b_n)$, respectively.

Definition 5. Let $\mathcal{L} = (D, \sqcap, \sqcup)$ be a lattice. A function $f : D^n \to \mathbb{Z}_+$ is called
submodular on $\mathcal{L}$ if

$$f(\mathbf{a} \sqcap \mathbf{b}) + f(\mathbf{a} \sqcup \mathbf{b}) \le f(\mathbf{a}) + f(\mathbf{b}) \quad \text{for all } \mathbf{a}, \mathbf{b} \in D^n .$$

It is called supermodular on $\mathcal{L}$ if

$$f(\mathbf{a} \sqcap \mathbf{b}) + f(\mathbf{a} \sqcup \mathbf{b}) \ge f(\mathbf{a}) + f(\mathbf{b}) \quad \text{for all } \mathbf{a}, \mathbf{b} \in D^n .$$

The sets of all submodular and supermodular functions on $\mathcal{L}$ will be denoted
$\mathrm{Sbmod}_{\mathcal{L}}$ and $\mathrm{Spmod}_{\mathcal{L}}$, respectively.
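Definition 5 can be checked mechanically on small domains. The sketch below is an illustration with hypothetical names; by default it takes the lattice to be a chain (a totally ordered domain), where meet and join are minimum and maximum, but other meet and join operations can be passed in. The examples are predicates of arity 2 with values 0 and 1.

```python
from itertools import product


def is_supermodular(f, domain, arity, meet=min, join=max):
    """Check f(a meet b) + f(a join b) >= f(a) + f(b) for all tuples a, b.

    By default the lattice is a chain on `domain`, so that the meet and the
    join of two elements are their minimum and maximum.
    """
    tuples = list(product(domain, repeat=arity))
    for a, b in product(tuples, repeat=2):
        lower = tuple(meet(x, y) for x, y in zip(a, b))
        upper = tuple(join(x, y) for x, y in zip(a, b))
        if f(lower) + f(upper) < f(a) + f(b):
            return False
    return True


conj = lambda t: int(t[0] == 1 and t[1] == 1)   # x AND y
disj = lambda t: int(t[0] == 1 or t[1] == 1)    # x OR y
equal = lambda t: int(t[0] == t[1])             # equality predicate
```

On the chain 0 < 1 the conjunction and the equality predicate pass the check while the disjunction fails it; on the chain 0 < 1 < 2 the equality predicate fails, for instance on the pair of tuples (1, 1) and (0, 2).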

Note that sub- and supermodular functions are usually defined to take values in 
K, but, in the context of Max CSP, it is appropriate to restrict the range to 
consist of non-negative integers. 

The properties of sub- and supermodularity are most often considered for 
functions defined on subsets of a set, which corresponds to the special case of 
Definition 5 where $|D| = 2$. Recall that a function f on subsets of a set is submodular 
if $f(X \cap Y) + f(X \cup Y) \le f(X) + f(Y)$ for all subsets X, Y, and it is 
supermodular if the inverse inequality holds [13,22]. The problem of minimizing 
a submodular set function is tractable [16,18,26]. Some results have also been ob- 
tained that concern minimization of a submodular function defined on a family 
of subsets [14,16,18,26], or on a finite grid (or integer lattice) [27], or on general 
lattices [28]. 

Observation 2. Let $f_1$ and $f_2$ be submodular functions on a lattice $\mathcal{L}$. 

— For any constants $\alpha_1, \alpha_2 \in \mathbb{Z}_+$, the function $\alpha_1 f_1 + \alpha_2 f_2$ is also submodular. 

— If $K \in \mathbb{Z}_+$ is an upper bound for the values taken by $f_1$, then the function 
$f = K - f_1$ is supermodular. 

— The function $f_1$ is submodular on the dual lattice $\mathcal{L}^{\partial}$ obtained by reversing 
the order of $\mathcal{L}$. 

(Corresponding statements also hold when the terms submodular and supermodular 
are exchanged throughout.) 

The next proposition shows that the non-trivial tractable case of Boolean Max 
CSP identified in Theorem 1 can be characterized using supermodularity. 

Proposition 2. A function $f \in R_{\{0,1\}}$ is 2-monotone if and only if it is supermodular. 

Proposition 2 is a key step in extending tractability results for Max CSP 
from the Boolean case to an arbitrary finite domain, as it allows us to re-state 
Theorem 1 in the following form. 

Corollary 1. Let $\mathcal{F} \subseteq R_{\{0,1\}}$ be a core. If $\mathcal{F} \subseteq \mathrm{Spmod}_{\{0,1\}}$, then (weighted) 
Max CSP($\mathcal{F}$) is in PO, otherwise it is APX-complete. 



3 Supermodular Constraints on Distributive Lattices 

In this section we consider constraints given by supermodular functions on a 
finite distributive lattice. Recall that a finite lattice $\mathcal{D} = (D, \sqcap, \sqcup)$ is distributive 
if and only if it can be represented by subsets of a set A, where the operations $\sqcup$ 
and $\sqcap$ are interpreted as set-theoretic union and intersection, respectively [10]. 
It is well-known [10] that A can be chosen so that $|A| \le |D|$. Note that if $\mathcal{D}$ is 
a finite distributive lattice, then the product lattice $\mathcal{D}^n = (D^n, \sqcap, \sqcup)$ is also a 
finite distributive lattice, which can be represented by subsets of a set of size at 
most $|D| \cdot n$, since every element of $\mathcal{D}$ can be represented using at most $|D|$ bits. 

It was shown in [18,26] that a submodular function on a finite distributive 
lattice^ representable by subsets of an n-element set can be minimized in poly- 
nomial time in n (assuming that computing the value of the function on a given 
argument is a primitive operation). The complexity of the best known algorithm 
is $O(n^5 \min\{\log nM, n^2 \log n\})$, where M is an upper bound for the values taken 
by the function [18]. 

Using this result, and the correspondence between sub- and supermodular 
functions, we obtain the following general result about tractable subproblems of 
Max CSP. 

Theorem 3. Weighted Max CSP($\mathcal{F}$) is in PO whenever $\mathcal{F} \subseteq \mathrm{Spmod}_{\mathcal{D}}$ for 
some distributive lattice $\mathcal{D}$ on D. 

^ referred to in [26] as a ring family. 

It is currently not known whether submodular functions on non-distributive 
lattices can be minimized in polynomial time, and this problem itself is of interest 
due to some applications (see [18]). Obviously, any progress in this direction 
would imply that Max CSP for supermodular constraints on the corresponding 
lattices could also be solved efficiently. 

4 Generalized 2-Monotone Constraints 

In this section we give a cubic-time algorithm for solving Max CSP($\mathcal{F}$) when $\mathcal{F}$ 
consists of supermodular functions of a special form which generalizes the class 
of 2-monotone Boolean constraints defined above. 

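Definition 6. A function $f \in R_D^{(n)}$ will be called generalized 2-monotone on a 
lattice $\mathcal{L}$ on D if it can be expressed as follows 

$f(\mathbf{x}) = 1 \iff ((x_{i_1} \sqsubseteq a_{i_1}) \wedge \ldots \wedge (x_{i_s} \sqsubseteq a_{i_s})) \vee ((x_{j_1} \sqsupseteq b_{j_1}) \wedge \ldots \wedge (x_{j_t} \sqsupseteq b_{j_t}))$    (1) 

where $\mathbf{x} = (x_1, \ldots, x_n)$, and $a_{i_1}, \ldots, a_{i_s}, b_{j_1}, \ldots, b_{j_t} \in D$, and either of the two 
disjuncts may be empty (i.e., the value of s or t may be zero). 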

It is easy to check that all generalized 2-monotone functions are supermodular 
(but the converse is not true in general). To obtain an efficient algorithm for 
Max CSP($\mathcal{F}$) when $\mathcal{F}$ consists of generalized 2-monotone functions, we construct 
a reduction to the Min Cut problem, which is known to be solvable in 
cubic time [15]. 

To outline the reduction, we need to give some more notation and definitions. 
Recall that a principal ideal in a lattice $\mathcal{L}$ is a set of the form $\{x \in \mathcal{L} \mid x \sqsubseteq a\}$, 
for some $a \in \mathcal{L}$, and a principal filter (or dual ideal) is a set of the form 
$\{x \in \mathcal{L} \mid x \sqsupseteq b\}$, for some $b \in \mathcal{L}$. For any generalized 2-monotone function f, we will call 
the first disjunct in Equation 1 of Definition 6 (containing conditions of the form 
$x \sqsubseteq a$), the ideal part of f, and the second disjunct in this equation (containing 
conditions of the form $x \sqsupseteq b$), the filter part of f. 

For any lattice $\mathcal{L}$, and any $c, d \in \mathcal{L}$, we shall write $c \prec d$ if $c \sqsubset d$ and there is 
no $u \in \mathcal{L}$ with $c \sqsubset u \sqsubset d$. Finally, let $B_b$ denote the set of all maximal elements 
in $\{x \in \mathcal{L} \mid x \not\sqsupseteq b\}$. Now we are ready to describe the digraph used in the 
reduction. 

Definition 7. Let $\mathcal{L}$ be a lattice on a finite set D, and let $\mathcal{F}$ be a set of generalized 
2-monotone functions on $\mathcal{L}$. 

Let $I = \{\rho_1 \cdot f_1(\mathbf{x}_1), \ldots, \rho_q \cdot f_q(\mathbf{x}_q)\}$, $q \ge 1$, be an instance of weighted 
Max CSP($\mathcal{F}$), over a set of variables $V = \{x_1, \ldots, x_n\}$, and let $\infty$ denote an 
integer greater than $\sum_i \rho_i$. 

We construct a digraph $G_I$ as follows: 

— The vertices of $G_I$ are as follows 

• $\{T, F\} \cup \{x_d \mid x \in V, d \in D\} \cup \{e_i, \bar{e}_i \mid i = 1, 2, \ldots, q\}$. 

For each $f_i$ where the ideal part is empty, we identify the vertices $e_i$ and F. 
Similarly, for each $f_i$ where the filter part is empty, we identify the vertices 
$\bar{e}_i$ and T. 

— The arcs of $G_I$ are defined as follows: 

• For each $c \prec d$ in $\mathcal{L}$ and for each $x \in V$, there is an arc from $x_c$ to $x_d$ 
with weight $\infty$; 

• For each $f_i$, there is an arc from $\bar{e}_i$ to $e_i$ with weight $\rho_i$; 

• For each $f_i$, and each conjunct of the form $x \sqsubseteq a$ in $f_i$, there is an arc 
from $e_i$ to $x_a$ with weight $\infty$; 

• For each $f_i$, and each conjunct of the form $x \sqsupseteq b$ in $f_i$, there is an arc 
from every $x_u$, where $u \in B_b$, to $\bar{e}_i$ with weight $\infty$. 

Arcs with weight less than $\infty$ will be called constraint arcs. 

It is easy to see that $G_I$ is a digraph with source T and sink F. 

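Example 1. Let $\mathcal{L}_\diamond$ be the lattice on {0, a, b, 1} such that $0 = 0_{\mathcal{L}_\diamond}$, $1 = 1_{\mathcal{L}_\diamond}$, and 
the “middle” elements a and b are incomparable. Consider the following instance 
I of Max CSP($\mathcal{F}$) 

$f(x, y) = \rho_1 \cdot f_1(x) + \rho_2 \cdot f_2(x) + \rho_3 \cdot f_3(x, y) + \rho_4 \cdot f_4(y)$ 

where the constraint functions $f_i$ are defined as follows: 

$f_1(x) = 1 \iff (x \sqsubseteq a)$ 
$f_2(x) = 1 \iff (x \sqsupseteq b)$ 
$f_3(x, y) = 1 \iff (y \sqsubseteq 0) \vee (x \sqsupseteq 1)$ 
$f_4(y) = 1 \iff (y \sqsupseteq 1)$ 

Note that, in $\mathcal{L}_\diamond$, $B_1 = \{a, b\}$, and $B_b = \{a\}$. One can check that the digraph 
shown in Figure 1 is the graph $G_I$ specified in Definition 7 above. 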
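
The sets of maximal elements used in Example 1 can be recomputed mechanically. The sketch below is illustrative only: the string encoding of the four lattice elements and the helper names `leq` and `maximal_not_above` are assumptions made for this example.

```python
# Diamond lattice on {0, a, b, 1}: 0 is the least element, 1 the greatest,
# and the middle elements a and b are incomparable.
ELEMS = ["0", "a", "b", "1"]

def leq(x, y):
    """Order relation of the diamond lattice."""
    return x == y or x == "0" or y == "1"

def maximal_not_above(b):
    """Maximal elements of the set of all x that do not lie above b."""
    candidates = [x for x in ELEMS if not leq(b, x)]
    return {x for x in candidates
            if not any(x != y and leq(x, y) for y in candidates)}

print(maximal_not_above("1"))
print(maximal_not_above("b"))
```

For the element 1 this yields the two middle elements, and for b it yields only a, matching the values stated in the example.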

It can be shown that, for any instance I of weighted Max CSP($\mathcal{F}$), the total 
weight of constraints that are not satisfied by an optimal solution is equal to the 
weight of a minimum cut in the graph $G_I$. The proof essentially uses the fact 
that the order $\sqsubseteq$ is a lattice order. Hence, we get the following result. 

Theorem 4. Let $\mathcal{L}$ be a lattice on a finite set D. If $\mathcal{F}$ consists of generalized 
2-monotone functions on $\mathcal{L}$, then (weighted) Max CSP($\mathcal{F}$) is solvable in 
$O(q^3 + n^3|D|^3)$ time, where q is the number of constraints and n is the number 
of variables in an instance. 

Theorem 4 shows that when the constraints in a Max CSP instance are 
described by generalized 2-monotone functions, then an optimal solution can be 
found much more efficiently than by invoking the general algorithm for submod- 
ular function minimization. 

This result suggests that it may be worthwhile to look for other forms of 
constraint for which there exist efficient optimization algorithms. In the next 
section, we show that when we consider totally ordered lattices then such efficient 
algorithms can be obtained for a wide range of supermodular constraints. 

Fig. 1. The digraph $G_I$ corresponding to the Max CSP instance defined in Example 1. 
Dashed lines denote constraint arcs, and solid lines denote arcs of weight $\infty$. 



5 Binary Supermodular Constraints on a Chain 

In this section we consider supermodular functions on a finite totally ordered 
lattice, or chain. One reason why chains are especially interesting in our study 
is the following lemma. 

Lemma 1. Every unary function is supermodular on a lattice $\mathcal{L}$ if and only if 
$\mathcal{L}$ is a chain. 
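
Lemma 1 can be illustrated on two small lattices. The sketch below is an assumption-laden experiment rather than a proof: it enumerates only 0/1-valued unary functions, and the helper names `lattice_ops` and `all_unary_supermodular` are hypothetical. It compares a three-element chain with the four-element lattice that has two incomparable middle elements.

```python
from itertools import product

def lattice_ops(elems, leq):
    """Derive meet and join from the order relation by exhaustive search."""
    def meet(x, y):
        lower = [z for z in elems if leq(z, x) and leq(z, y)]
        return next(z for z in lower if all(leq(w, z) for w in lower))
    def join(x, y):
        upper = [z for z in elems if leq(x, z) and leq(y, z)]
        return next(z for z in upper if all(leq(z, w) for w in upper))
    return meet, join

def all_unary_supermodular(elems, leq):
    """True if every 0/1-valued unary function is supermodular on the lattice."""
    meet, join = lattice_ops(elems, leq)
    for values in product((0, 1), repeat=len(elems)):
        f = dict(zip(elems, values))
        for x, y in product(elems, repeat=2):
            if f[meet(x, y)] + f[join(x, y)] < f[x] + f[y]:
                return False
    return True

chain = [0, 1, 2]
chain_leq = lambda x, y: x <= y
diamond = ["0", "a", "b", "1"]
diamond_leq = lambda x, y: x == y or x == "0" or y == "1"

print(all_unary_supermodular(chain, chain_leq))
print(all_unary_supermodular(diamond, diamond_leq))
```

On the chain the meet and join of any pair are the pair itself, so the inequality holds with equality. On the non-chain lattice the indicator function of a single middle element already violates it.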

It is easy to see that a chain is a distributive lattice, which implies that 
Theorem 3 can be applied, and hence that Max CSP($\mathcal{F}$) is tractable for all sets 
$\mathcal{F}$ consisting of supermodular constraints on a chain. Furthermore, by Lemma 1, 
such sets of functions can include all unary functions. 

We will now show that, for supermodular constraints which are at most 
binary, this result can be further strengthened, to obtain a more efficient special- 
purpose optimization algorithm. We obtain this result by using a reduction to the 
Min Cut problem as in the preceding section, but in this case we can eliminate 
the dependence of the running time on the number of constraints. The proof 
uses the fact that binary submodular functions on a chain precisely correspond 
to square Monge matrices and hence can be decomposed into a sum of simpler 
functions [6]. 
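
The correspondence with Monge matrices mentioned above can be checked numerically on small cases. In this sketch (the function names are illustrative assumptions), a binary function on the chain 0 < 1 < ... < n-1 is stored as a square matrix M with M[i][j] = f(i, j). The Monge condition is tested in its standard adjacent-entries form and compared with the submodularity inequality on the chain.

```python
from itertools import product

def is_monge(M):
    """Adjacent form of the Monge condition:
    M[i][j] + M[i+1][j+1] <= M[i][j+1] + M[i+1][j]."""
    n = len(M)
    return all(M[i][j] + M[i + 1][j + 1] <= M[i][j + 1] + M[i + 1][j]
               for i in range(n - 1) for j in range(n - 1))

def is_submodular_on_chain(M):
    """Submodularity of f(i, j) = M[i][j] with min as meet and max as join."""
    n = len(M)
    pairs = list(product(range(n), repeat=2))
    return all(M[min(a, c)][min(b, d)] + M[max(a, c)][max(b, d)]
               <= M[a][b] + M[c][d]
               for (a, b) in pairs for (c, d) in pairs)

distance = [[abs(i - j) for j in range(4)] for i in range(4)]
products = [[i * j for j in range(4)] for i in range(4)]
print(is_monge(distance), is_submodular_on_chain(distance))
print(is_monge(products), is_submodular_on_chain(products))
```

The two tests agree on both sample matrices. Reversing the inequality gives the corresponding test for supermodular functions.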

Theorem 5. Let $\mathcal{C}$ be a chain on a finite set D. If $\mathcal{F} \subseteq \mathrm{Spmod}_{\mathcal{C}}$, and each 
$f \in \mathcal{F}$ is at most binary, then Max CSP($\mathcal{F}$) is solvable in $O(n^3|D|^3)$ time, 
where n is the number of variables in an instance. 

The next theorem is the main result of this section. It shows that the only 
tractability-preserving way of extending the set $\mathcal{F}$ from Theorem 5 is with further 
supermodular functions; all other extensions give rise to hard problems. 

Theorem 6. Let $\mathcal{C}$ be a chain on a finite set D, and let $\mathcal{F} \subseteq R_D$ contain all 
binary supermodular functions on $\mathcal{C}$. 

If $\mathcal{F} \subseteq \mathrm{Spmod}_{\mathcal{C}}$, then (weighted) Max CSP($\mathcal{F}$) is in PO, otherwise it is 
NP-hard. 


6 A Simple Non-supermodular Constraint 

We have established in the previous section that for chains, in the presence of all 
binary supermodular functions, supermodularity is the only possible reason for 
tractability. It can be shown using results of [24] that the binary supermodular 
functions on a finite chain determine the chain (up to reverse order) . However, 
by Lemma 1, all unary functions are supermodular on every chain. It is therefore 
an interesting question to determine whether supermodularity on a chain is the 
only possible reason for tractability of Max CSP($\mathcal{F}$) when $\mathcal{F}$ contains all unary 
functions. 

In this section we give some evidence in favour of a positive answer to this 
question, by considering a simple equality constraint. Interestingly, in all of the 
various versions of the constraint satisfaction problem for which complexity clas- 
sifications have previously been obtained, an equality constraint can be combined 
with any tractable set of constraints without affecting tractability. However, we 
show here that such a constraint gives rise to hard subproblems of Max CSP, 
in the presence of some simple unary constraints. 

Definition 8. Let D be a finite set. We define the function $f_{eq} \in R_D^{(2)}$, and the 
functions $c_d \in R_D^{(1)}$ for each $d \in D$, as follows 

$f_{eq}(x, y) = 1 \iff (x = y)$ and $c_d(x) = 1 \iff (x = d)$. 

It is easy to check that $f_{eq}$ on D is supermodular if $|D| = 2$. However, the next 
result shows that $|D| = 2$ is the only case for which this is true. 

Lemma 2. If $|D| > 2$ then $f_{eq}(x, y)$ is not supermodular on any lattice on D. 

Note that Max CSP($\{f_{eq}\}$) is clearly tractable. However, this does not give us 
an interesting tractable subproblem of Max CSP, since $\{f_{eq}\}$ is not a core. In 
fact, the core obtained from $\{f_{eq}\}$ is one-element. 
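
The contrast between |D| = 2 and |D| > 2 can be observed directly on chains. The sketch below only covers chains (Lemma 2 itself is stated for every lattice on D), and the function name `feq_supermodular_on_chain` is an illustrative assumption. It returns a violating pair of tuples when one exists.

```python
from itertools import product

def feq_supermodular_on_chain(size):
    """Check supermodularity of the equality function on the chain 0 < ... < size-1.
    Returns (True, None) or (False, witness), where witness is a violating pair."""
    f = lambda x, y: 1 if x == y else 0
    pairs = list(product(range(size), repeat=2))
    for (a1, a2), (b1, b2) in product(pairs, repeat=2):
        lo = (min(a1, b1), min(a2, b2))  # componentwise meet
        hi = (max(a1, b1), max(a2, b2))  # componentwise join
        if f(*lo) + f(*hi) < f(a1, a2) + f(b1, b2):
            return False, ((a1, a2), (b1, b2))
    return True, None

print(feq_supermodular_on_chain(2))
print(feq_supermodular_on_chain(3))
```

On the three-element chain, the tuples (1, 1) and (0, 2) form one violating pair: their meet (0, 1) and join (1, 2) both have unequal components, while (1, 1) does not.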

The next theorem shows that the equality constraint feq, when considered 
together with the set of unary functions Cd (to make a core) , gives rise to a hard 
problem. 

Theorem 7. For any finite set D with $|D| > 2$, if $\mathcal{F} \supseteq \{c_d \mid d \in D\} \cup \{f_{eq}\}$, 
then Max CSP($\mathcal{F}$) is APX-complete. 

The proof is by reduction from the APX-complete Minimum 3-terminal 
Cut problem [8]. In fact, in Theorem 7, it is enough to require that $\mathcal{F}$ contains 
at least three functions of the form $c_d$. 

7 Conclusion 

We believe that the most interesting feature of the research presented in this 
paper is that it brings together several different methods and directions in com- 
binatorial optimization which have previously been studied separately: Max 
CSP, submodular functions, and Monge properties. We hope that the ideas and 
results presented here will stimulate research in all of these areas, and perhaps 
also impact on other related areas of combinatorial optimization. In particular, 
the problem of minimizing submodular functions on non-distributive lattices 
becomes especially important in view of the links we have discovered. 

The close connection we have established between tractable cases of Max 
CSP and the property of supermodularity leads us to conjecture that super- 
modularity is the only possible reason for tractability in Max CSP. Regardless 
of whether this conjecture holds, the results we have given above demonstrate 
that significant progress can now be made in developing efficient algorithms for 
all the known tractable cases of Max CSP by exploiting the large body of exist- 
ing results concerning sub- and supermodularity, and Monge properties (e.g., [6, 
24,28]). 

One possible direction to extend our results would be a further study of the 
approximability of constraint satisfaction problems over arbitrary finite domains. 
For example, the techniques presented here can be further fine-tuned to establish 
APX-completeness for at least some of the remaining NP-hard cases of Max 
CSP. However, to complete the study of approximability properties, it is likely 
to be necessary to define appropriate notions of expressiveness for a given set of 
constraint functions, and this has previously only been developed for the Boolean 
case [7]. 

Abstract. We consider the Boolean constraint isomorphism problem, 
that is, the problem of determining whether two sets of Boolean con- 
straint applications can be made equivalent by renaming the variables. 

We show that depending on the set of allowed constraints, the problem 
is either coNP-hard and GI-hard, equivalent to graph isomorphism, or 
polynomial-time solvable. This establishes a complete classification of the 
complexity of the problem, and moreover, it identifies exactly all those 
cases in which Boolean constraint isomorphism is polynomial-time many- 
one equivalent to graph isomorphism, the best-known and best-examined 
isomorphism problem in theoretical computer science. 

1 Introduction 

Constraint satisfaction problems (or, constraint networks) were introduced in 
1974 by U. Montanari to solve computational problems related to picture pro- 
cessing [26]. It turned out that they form a broad class of algorithmic problems 
that arise naturally in different areas [20]. Today, they are ubiquitous in com- 
puter science (database query processing, circuit design, network optimization, 
planning and scheduling, programming languages), artificial intelligence (belief 
maintenance and knowledge based systems, machine vision, natural language 
understanding), and computational linguistics (formal syntax and semantics of 
natural languages). 

A constraint satisfaction instance is given by a set of variables, a set of values 
that the variables may take (the so-called universe), and a set of constraints. 
A constraint restricts the possible assignments of values to variables; formally a 
k-place constraint is a k-ary relation over the universe. The most basic question 
one is interested in is to determine if there is an assignment of values to the 
variables such that all constraints are satisfied. 

* Research supported in part by grants NSF-INT-9815095/DAAD-315-PPP-gii-ab, 

This problem has been studied intensively in the past decade from a computational 
complexity point of view. In a particular case, that of 2-element universes, 
a remarkable complete classification was obtained, in fact already much earlier, 
by Thomas Schaefer [28]. Note that in this case of a Boolean universe, the vari- 
ables are propositional variables and the constraints are Boolean relations. A 
constraint satisfaction instance, thus, is a propositional formula in conjunctive 
normal form where, instead of the usual clauses, arbitrary Boolean relations may 
be used. In other words, the constraint satisfaction problem here is the satisfia- 
bility problem for generalized propositional formulas. Obviously the complexity 
of this problem depends on the set C of constraints allowed, and is therefore 
denoted by CSP(C) (C will always be finite in this paper). In this way we obtain 
an infinite family of NP-problems, and Schaefer showed that each of them is 
either NP-complete or polynomial-time solvable. This result is surprising, since 
by Ladner’s Theorem [25] there is an infinite number of complexity degrees between 
P and NP (assuming P $\neq$ NP), and consequently it is well conceivable that 
the members of an infinite family of problems may be located anywhere in this 
hierarchy. Schaefer showed that for the generalized satisfiability problem this is 
not the case: Each CSP(C) is either NP-complete, that is in the highest degree, 
or in the lowest degree P. Therefore his result is called a dichotomy theorem. 

For larger universes, much less is known. Satisfiability of constraint net- 
works is always in NP, and for large families of allowed sets of constraints, 
NP-completeness was proven while for others, tractability (i.e., polynomial-time 
algorithms) was obtained. Research in this direction was strongly influenced 
by the seminal papers [15,13], and many deep and beautiful results have been 
proven since then, see, e. g., [14,24,16,4]. Only recently, a dichotomy theorem for 
the complexity of satisfiability of constraint networks over 3-element universes 
was published [6] , but for larger domains such a complete classification still seems 
to be out of reach. For Boolean universes, however, a number of further compu- 
tational problems have been addressed and in most cases, dichotomy theorems 
were obtained. These problems concern, among others, the problems to count 
how many satisfying solutions an instance has [7], to enumerate all satisfying 
solutions [8], to determine in certain ways optimal satisfying assignments [10, 
18,27], to determine if there is a unique satisfying assignment [17], learnability 
questions related to propositional formulas [11], the inverse satisfiability problem 
[21], and the complexity of propositional circumscription [19,12]. Results about 
approximability of optimization problems related to Boolean CSPs appeared in 
[31,23]. We point the reader to the monograph [9] that discusses much of what 
is known about Boolean constraints. 

In this paper, we address a problem that is not a variation of satisfiability, 
namely, the isomorphism problem for Boolean constraints. Perhaps the most 
prominent isomorphism problem in computational complexity theory is the graph 
isomorphism problem, GI, asking given two graphs if they are isomorphic. Graph 
isomorphism has been well studied because it is one of the very few problems 
in NP neither known to be NP-complete nor known to be in P (in fact, there is 
strong evidence that GI is not NP-complete, see [22]); thus GI may be in one of 
the “intermediate degrees” mentioned above. 

Another isomorphism problem studied intensively in the past few years is the 
propositional formula isomorphism. This problem asks, given two propositional 
formulas, if there is a renaming of the variables that makes both equivalent. The 
history of this problem goes back to the 19th century, where Jevons and Clifford, 
two mathematicians, were concerned with the task to construct formulas or circuits 
for all n-ary Boolean functions, but since there are too many ($2^{2^n}$) of them 
they wanted to identify a small set of Boolean circuits from which all others 
could then be obtained by some simple transformation. This problem has been 
referred to since as the “Jevons-Clifford-Problem.” One of the transformations 
they used was renaming of variables (producing an isomorphic circuit), another 
one was first renaming the variables and then negating some of them (producing 
what has been called a congruent circuit). Hence it is important to know how 
many equivalence classes for isomorphism and congruence there are, and how 
to determine if two circuits or formulas are isomorphic or congruent. (A more 
detailed discussion of these developments can be found in [29, pp. 6-8].) How- 
ever, the exact complexity of the isomorphism problem for Boolean circuits and 
formulas (the congruence problem turns out to be of the same computational 
complexity; technically: both problems are polynomial-time many-one equiva- 
lent) is still unknown: It is trivially hard for the class coNP (of all complements 
of NP-problems) and in $\Sigma_2^p$ (the second level of the polynomial hierarchy), and 
Agrawal and Thierauf showed that it is most likely not $\Sigma_2^p$-hard (that is, unless 
the polynomial hierarchy collapses, an event considered very unlikely by most 
complexity-theorists) [2] . 

In this paper we study the Boolean formula isomorphism problem restricted 
to formulas in the Schaefer sense, in other words: the isomorphism problem for 
Boolean constraints. In a precursor, the present authors showed that this problem 
is either coNP-hard (the hard case, the same as for general formula isomorphism) 
or reducible to the graph isomorphism problem (the easy case) [3]. This result 
is not satisfactory, since it leaves the most interesting questions open: Are there 
“really easy” cases for which the isomorphism problem is tractable (that is, in 
P)? What exactly are these? And are the remaining cases which reduce to graph 
isomorphism actually equivalent to GI? 

The present paper answers these questions affirmatively. To state precisely 
our main result (Theorem 7) already here (formal definitions of the relevant 
classes of constraints will be given in the next section), constraint isomorphism 
is coNP-hard and GI-hard for classes C of constraints that are neither Horn 
nor anti-Horn nor affine nor bijunctive, it is in P if C is both affine and 
bijunctive, and in all other cases, the isomorphism problem is equivalent to graph 
isomorphism. This classification holds for constraint applications with as well as 
without constants. As in the case of Schaefer’s dichotomy, we thus obtain simple 
criteria to determine, given C, which of the three cases holds. This theorem gives 
a complete classification of the computational complexity of Boolean constraint 
isomorphism. Moreover, it determines exactly all those cases of the Boolean 
constraint isomorphism problem that are equivalent to graph isomorphism, the 
most prominent and probably most studied isomorphism problem so far. 

The next section formally introduces constraint satisfaction problems and 
the relevant properties of constraints. Section 3 then contains the proof of our 
main theorem: In Section 3.1 we identify those classes of constraints for which 
isomorphism is in P and Section 3.2 contains the main technical contribution 
of this paper proving GI-hardness for all other cases. Due to space restrictions, 
most proofs in this paper had to be omitted. We refer the reader to the ACM 
Computing Research Repository Report cs.CC/0306134. 

2 Preliminaries 

We start by formally introducing constraint problems. The following section is 
essentially from [3], following the standard notation developed in [9]. 

Definition 1. 1. A constraint C (of arity k) is a Boolean function from $\{0, 1\}^k$ 
to {0, 1}. 

2. If C is a constraint of arity k, and $x_1, x_2, \ldots, x_k$ are (not necessarily distinct) 
variables, then $C(x_1, x_2, \ldots, x_k)$ is a constraint application of C. In this 
paper, we view a constraint application as a Boolean function on a specific 
set of variables. Thus, for example, $x_1 \vee x_2 = x_2 \vee x_1$. 

3. If C is a constraint of arity k, and for $1 \le i \le k$, $x_i$ is a variable or a 
constant (0 or 1), then $C(x_1, x_2, \ldots, x_k)$ is a constraint application of C 
with constants. 

4. If A is a constraint application [with constants], and X a set of variables 
that includes all variables that occur in A, we say that A is a constraint 
application [with constants] over variables X. Note that we do not require 
that every element of X occurs in A. 

The complexity of Boolean constraint problems depends on those properties 
of constraints that we define next. 

Definition 2. Let C be a constraint. 

— C is 0-valid if $C(\vec{0}) = 1$. Similarly, C is 1-valid if $C(\vec{1}) = 1$. 

— C is Horn (or weakly negative) [anti-Horn (or weakly positive)] if C is equiva- 
lent to a CNF formula where each clause has at most one positive [negative] 
literal. 

— C is bijunctive if C is equivalent to a 2CNF formula. 

— C is affine if C is equivalent to an XOR-CNF formula. 

— C is 2-affine (or, affine with width 2) if C is equivalent to an XOR-CNF 
formula such that every clause contains at most two literals. 

Let $\mathcal{C}$ be a finite set of constraints. We say $\mathcal{C}$ is 0-valid, 1-valid, Horn, anti-Horn, 
bijunctive, or affine if every constraint $C \in \mathcal{C}$ is 0-valid, 1-valid, Horn, anti-Horn, 
bijunctive, or affine, respectively. Finally, we say that $\mathcal{C}$ is Schaefer if $\mathcal{C}$ is Horn 
or anti-Horn or affine or bijunctive. 
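
For small constraints, the properties of Definition 2 can be tested without constructing any formula. The sketch below does not use the syntactic definitions given above. Instead it relies on the standard closure characterizations from the literature on Schaefer's theorem: a Boolean relation is Horn iff it is closed under coordinate-wise AND, anti-Horn iff closed under OR, bijunctive iff closed under ternary majority, and affine iff closed under ternary XOR. The function names and the dictionary layout are illustrative assumptions.

```python
from itertools import product

def sat_set(constraint, k):
    """Set of all k-tuples of bits on which the constraint evaluates to true."""
    return {t for t in product((0, 1), repeat=k) if constraint(*t)}

def closed_under(rel, op, op_arity):
    """True if applying op coordinate-wise to any tuples of rel stays inside rel."""
    return all(tuple(op(*bits) for bits in zip(*ts)) in rel
               for ts in product(rel, repeat=op_arity))

def classify(constraint, k):
    rel = sat_set(constraint, k)
    return {
        "0-valid": (0,) * k in rel,
        "1-valid": (1,) * k in rel,
        "Horn": closed_under(rel, lambda x, y: x & y, 2),
        "anti-Horn": closed_under(rel, lambda x, y: x | y, 2),
        "bijunctive": closed_under(rel, lambda x, y, z: (x & y) | (x & z) | (y & z), 3),
        "affine": closed_under(rel, lambda x, y, z: x ^ y ^ z, 3),
    }

xor2 = classify(lambda x, y: x != y, 2)                  # x XOR y
one_in_three = classify(lambda x, y, z: x + y + z == 1, 3)
print(xor2)
print(one_in_three)
```

Here the binary XOR constraint comes out as both affine and bijunctive, while the 1-in-3 constraint has none of the four Schaefer properties.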

The question studied in this paper is that of whether a set of constraint 
applications can be made equivalent to a second set of constraint applications 
using a suitable renaming of its variables. We need some definitions. 

Definition 3. 1. Let S be a set of constraint applications with constants over 
variables X and let $\pi$ be a permutation of X. By $\pi(S)$ we denote the set 
of constraint applications that results when we replace simultaneously all 
variables x in S by $\pi(x)$. 

2. Let S be a set of constraint applications over variables X. The number of 
satisfying assignments of S, $\#_1(S)$, is defined as $\|\{ I \mid I$ is an assignment to 
all variables in X that satisfies every constraint application in $S \}\|$. 

The isomorphism problem for Boolean constraints, first defined and examined 
in [3] is formally defined as follows. 

Definition 4. 1. ISO($\mathcal{C}$) is the problem of, given two sets S and U of constraint 
applications of $\mathcal{C}$ over variables X, to decide whether S and U are isomorphic, 
i.e., whether there exists a permutation $\pi$ of X such that $\pi(S)$ is equivalent 
to U. 

2. ISO$_c$($\mathcal{C}$) is the problem of, given two sets S and U of constraint applications 
of $\mathcal{C}$ with constants over variables X, to decide whether S and U are 
isomorphic. 
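
For very small instances, the isomorphism question of Definition 4 can be decided by exhaustive search. The sketch below is not the method of this paper: it tries every permutation and compares sets of satisfying assignments, which takes exponential time. The representation of a constraint application as a pair (function, tuple of variable names) and all helper names are assumptions made for illustration. Constants are not handled.

```python
from itertools import permutations, product

def satisfying(S, X):
    """Assignments to X (bit tuples in the order of X) that satisfy every application in S."""
    result = set()
    for bits in product((0, 1), repeat=len(X)):
        env = dict(zip(X, bits))
        if all(c(*(env[v] for v in args)) for c, args in S):
            result.add(bits)
    return result

def rename(S, pi):
    """Replace every variable v in S by pi[v] simultaneously."""
    return [(c, tuple(pi[v] for v in args)) for c, args in S]

def isomorphic(S, U, X):
    """Return a permutation of X making S equivalent to U, or None if there is none."""
    target = satisfying(U, X)
    for image in permutations(X):
        pi = dict(zip(X, image))
        if satisfying(rename(S, pi), X) == target:
            return pi
    return None

implies = lambda x, y: (not x) or y
X = ("x", "y", "z")
S = [(implies, ("x", "y"))]                         # x -> y
U = [(implies, ("z", "x"))]                         # z -> x
T = [(implies, ("x", "y")), (implies, ("y", "x"))]  # x <-> y

print(len(satisfying(S, X)))   # number of satisfying assignments of S
print(isomorphic(S, U, X))
print(isomorphic(S, T, X))
```

The length of `satisfying(S, X)` is the number of satisfying assignments from Definition 3. It is invariant under renaming, so S and T, whose counts differ, cannot be isomorphic.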

Böhler et al. obtained results about the complexity of the just-defined problem 
that, interestingly, pointed out relations to another isomorphism problem: 
the graph isomorphism problem (GI). 

Definition 5. GI is the problem of, given two graphs G and H, to determine 
whether G and H are isomorphic, i.e., whether there exists a bijection $\pi : V(G) \to V(H)$ 
such that for all $v, w \in V(G)$, $\{v, w\} \in E(G)$ iff $\{\pi(v), \pi(w)\} \in E(H)$. Our 
graphs are undirected, and do not contain self-loops. We also assume a standard 
enumeration of the edges, and will write $E(G) = \{e_1, \ldots, e_m\}$. 
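
Definition 5 translates directly into an exhaustive search over bijections. The sketch below is exponential and only intended to make the definition concrete. The function name and the edge-list representation are illustrative assumptions.

```python
from itertools import permutations

def graph_isomorphism(VG, EG, VH, EH):
    """Return a bijection V(G) -> V(H) that preserves edges and non-edges, or None."""
    edges_g = {frozenset(e) for e in EG}
    edges_h = {frozenset(e) for e in EH}
    if len(VG) != len(VH) or len(edges_g) != len(edges_h):
        return None
    for image in permutations(VH):
        pi = dict(zip(VG, image))
        # With equal edge counts, mapping every edge of G onto an edge of H
        # is enough: the induced map on edges is then a bijection.
        if all(frozenset(pi[v] for v in e) in edges_h for e in edges_g):
            return pi
    return None

path = (["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")])
path2 = ([1, 2, 3, 4], [(3, 1), (1, 4), (4, 2)])
star = ([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)])

print(graph_isomorphism(*path, *path2))
print(graph_isomorphism(*path, *star))
```

The two paths are isomorphic, whereas the path and the star have the same number of edges but different degree sequences, so the search returns None.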

GI is a problem that is in NP, not known to be in P, and not NP-complete 
unless the polynomial hierarchy collapses. For details, see, for example, [22]. 
Recently, Torán showed that GI is hard for NL, PL, Mod$_k$L, and DET under 
logspace many-one reductions [30]. Arvind and Kurur showed that GI is in the 
class SPP [1], and thus, for example, in $\oplus$P. 

The main result from [3] can now be stated as follows. 

Theorem 6. Let $\mathcal{C}$ be a set of constraints. If $\mathcal{C}$ is Schaefer, then ISO($\mathcal{C}$) and 
ISO$_c$($\mathcal{C}$) are polynomial-time many-one reducible to GI, otherwise, ISO($\mathcal{C}$) and 
ISO$_c$($\mathcal{C}$) are coNP-hard. 

Note that if C is Schaefer the isomorphism problems ISO(C) and ISOc(C) 
cannot be coNP-hard, unless NP = coNP. (This follows from Theorem 6 and 
the fact that GI is in NP.) Under the (reasonable) assumption that NP $\neq$ coNP, 
and that GI is neither in P, nor NP-complete, Theorem 6 thus distinguishes a 
hard case (coNP-hard) and an easier case (many-one reducible to GI). 

Böhler et al. also pointed out that there are some bijunctive, Horn, or affine 
constraint sets C for which actually ISO(C) and ISOc(C) are equivalent to graph 
isomorphism. On the other hand, certainly there are C for which ISO(C) and 
ISOc(C) are in P. In the upcoming section we will completely classify the com- 
plexity of ISO(C) and ISOc(C), obtaining for which C exactly we are equivalent 
to GI and for which C we are in P. 



3 A Classification of Boolean Constraint Isomorphism 

The main result of the present paper is a complete complexity-theoretic classi- 
fication of the isomorphism problem for Boolean constraints. 

Theorem 7. Let $\mathcal{C}$ be a finite set of constraints. 

1. If $\mathcal{C}$ is not Schaefer, then ISO($\mathcal{C}$) and ISO$_c$($\mathcal{C}$) are coNP-hard and GI-hard. 

2. If $\mathcal{C}$ is Schaefer and not 2-affine, then ISO($\mathcal{C}$) and ISO$_c$($\mathcal{C}$) are polynomial-time 
many-one equivalent to GI. 

3. Otherwise, $\mathcal{C}$ is 2-affine and ISO($\mathcal{C}$) and ISO$_c$($\mathcal{C}$) are in P. 

The rest of this section is devoted to a proof of this theorem and organized as 
follows. The coNP lower-bound part from Theorem 7 follows from Theorem 6. 
In Section 3.1 we will prove the polynomial-time upper bound if C is 2-affine 
(Theorem 10). The GI upper bound if C is Schaefer again is part of Theorem 6. In 
Section 3.2 we will show that ISOc(C) is GI-hard if C is not 2-affine (Theorems 15 
and 17). Theorem 18 finally shows that ISO(C) is GI-hard if C is not 2-affine. 

3.1 Upper Bounds 

A central step in our way of obtaining upper bounds is to bring sets of constraint 
applications into a unique normal form. This approach is also followed in the 
proof of the coIP[2]$^{NP}$ upper bound^ for the isomorphism problem for Boolean 
formulas [2] and the GI upper bound from Theorem 6 [3]. 

Definition 8. Let $\mathcal{C}$ be a set of constraints. nf is a normal form function for $\mathcal{C}$ 
if and only if for all sets S and U of constraint applications of $\mathcal{C}$ with constants 
over variables X, and for all permutations $\pi$ of X, 

1. nf(S, X) is a set of Boolean functions over variables X, 

2. $S \equiv nf(S, X)$ (here we view S as a set of Boolean functions, and define 
equivalence for such sets as logical equivalence of corresponding propositional 
formulas), 

3. $nf(\pi(S), X) = \pi(nf(S, X))$, and 

4. if $S \equiv U$, then nf(S, X) = nf(U, X) (here, “=” is equality between sets of 
Boolean functions). 

It is important to note that nf{S,X) is not necessarily a set of constraint appli- 
cations of C with constants. 

^ Here IP [2] means an interactive proof system where there are two messages ex- 
changed between the verifier and the prover. 

An easy property of the definition is that $S \equiv U$ iff nf(S, X) = nf(U, X). 
Also, it is not too hard to observe that using normal forms removes the need to 
check whether two sets of constraint applications with constants are equivalent, 
more precisely: S is isomorphic to U iff there exists a permutation $\pi$ of X such 
that $\pi(nf(S)) = nf(U)$. 

There are different possibilities for normal forms. The one used by [3] is 
the maximal equivalent set of constraint applications with constants, defined by 
$\mathit{nf}(S, X)$ to be the set of all constraint applications A of C with constants over 
variables X such that $S \rightarrow A$. For the P upper bound for 2-affine constraints, 
we use a normal form described in the following lemma. Note that this normal 
form is not necessarily a set of 2-affine constraint applications with constants. 

Lemma 9. Let C be a set of 2-affine constraints. There exists a polynomial-time 
computable normal form function nf for C such that for all sets S of constraint 
applications of C with constants over variables X, the following hold: 

1. If $S \equiv 0$, then $\mathit{nf}(S, X) = \{0\}$. 

2. If $S \not\equiv 0$, then $\mathit{nf}(S, X) = \{\overline{Z}, O\} \cup \bigcup_{i=1}^{\ell} \{(X_i \wedge \overline{Y_i}) \vee (\overline{X_i} \wedge Y_i)\}$, where 
$Z, O, X_1, Y_1, \ldots, X_\ell, Y_\ell$ are pairwise disjoint subsets of X such that $X_i \cup Y_i \neq \emptyset$ 
for all $1 \le i \le \ell$, and for W a set of variables, W in a formula denotes 
$\bigwedge W$, and $\overline{W}$ denotes $\neg \bigvee W$. 

Making use of the normal form, it is not too hard to prove our claimed upper 
bound. 

Theorem 10. Let C be a set of constraints. If C is 2-affine, then ISO(C) and 
ISOc(C) are in P. 

Proof. Let S and U be two sets of constraint applications of C and let X 
be the set of variables that occur in $S \cup U$. Use Lemma 9 to bring S and 
U into normal form. Using the first point in that lemma, it is easy to check 
whether S or U are equivalent to 0. For the remainder of the proof, we now 
suppose that neither S nor U is equivalent to 0. Let $Z, O, X_1, Y_1, \ldots, X_\ell, Y_\ell$ and 
$Z', O', X'_1, Y'_1, \ldots, X'_k, Y'_k$ be subsets of X such that: 

1. $Z, O, X_1, Y_1, \ldots, X_\ell, Y_\ell$ are pairwise disjoint and $Z', O', X'_1, Y'_1, \ldots, X'_k, Y'_k$ 
are pairwise disjoint, 

2. $X_i \cup Y_i \neq \emptyset$ for all $1 \le i \le \ell$ and $X'_i \cup Y'_i \neq \emptyset$ for all $1 \le i \le k$, 

3. $\mathit{nf}(S, X) = \{\overline{Z}, O\} \cup \bigcup_{i=1}^{\ell} \{(X_i \wedge \overline{Y_i}) \vee (\overline{X_i} \wedge Y_i)\}$, and 
$\mathit{nf}(U, X) = \{\overline{Z'}, O'\} \cup \bigcup_{i=1}^{k} \{(X'_i \wedge \overline{Y'_i}) \vee (\overline{X'_i} \wedge Y'_i)\}$. 

We need to determine whether S is isomorphic to U. Since nf is a normal 
form function for C, it suffices to check if there exists a permutation $\pi$ on X such 
that $\pi(\mathit{nf}(S, X)) = \mathit{nf}(U, X)$. Note that 

\[
\pi(\mathit{nf}(S, X)) = \{\overline{\pi(Z)}, \pi(O)\} \cup \bigcup_{i=1}^{\ell} \{(\pi(X_i) \wedge \overline{\pi(Y_i)}) \vee (\overline{\pi(X_i)} \wedge \pi(Y_i))\}.
\]

It is immediate that $\pi(\mathit{nf}(S, X)) = \mathit{nf}(U, X)$ if and only if 

- $\ell = k$, $\pi(Z) = Z'$, $\pi(O) = O'$, and 

- $\{\{\pi(X_1), \pi(Y_1)\}, \ldots, \{\pi(X_\ell), \pi(Y_\ell)\}\} = \{\{X'_1, Y'_1\}, \ldots, \{X'_k, Y'_k\}\}$. 

Since $Z, O, X_1, Y_1, \ldots, X_\ell, Y_\ell$ are pairwise disjoint subsets of X, and since 
$Z', O', X'_1, Y'_1, \ldots, X'_k, Y'_k$ are pairwise disjoint subsets of X, it is easy to see that 
there exists a permutation $\pi$ on X such that $\mathit{nf}(\pi(S), X) = \mathit{nf}(U, X)$ if and only 
if 

- $\ell = k$, $\|Z\| = \|Z'\|$, $\|O\| = \|O'\|$, and 

- $[\{\|X_1\|, \|Y_1\|\}, \ldots, \{\|X_\ell\|, \|Y_\ell\|\}] = [\{\|X'_1\|, \|Y'_1\|\}, \ldots, \{\|X'_k\|, \|Y'_k\|\}]$; 
here $[\cdots]$ denotes a multi-set. 

It is easy to see that the above conditions can be verified in polynomial time. 
It follows that ISO(C) and ISOc(C) are in P. □ 
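
The final test in this proof only compares cardinalities, which a few lines of code make concrete. The following is a minimal sketch, assuming a normal form is already available as a triple (Z, O, list of (X_i, Y_i) pairs) of Python sets; the function names and this representation are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

def signature(zero_vars, one_vars, pairs):
    """Cardinality invariant of a normal form (Z, O, [(X_1, Y_1), ...])."""
    # Each block {X_i, Y_i} is unordered, so its two sizes are stored sorted;
    # the Counter plays the role of the multi-set in the proof.
    block_sizes = Counter(tuple(sorted((len(x), len(y)))) for x, y in pairs)
    return (len(zero_vars), len(one_vars), block_sizes)

def normal_forms_isomorphic(nf_s, nf_u):
    """True iff the two normal forms satisfy the cardinality conditions."""
    return signature(*nf_s) == signature(*nf_u)

# Illustrative data (hypothetical normal forms over the same variable set).
nf_s = ({"z"}, {"o"}, [({"a", "b"}, {"c"}), ({"d"}, set())])
nf_u = ({"a"}, {"b"}, [(set(), {"z"}), ({"o"}, {"c", "d"})])
nf_v = ({"a"}, {"b"}, [({"z", "o"}, {"c", "d"})])
```

Equality of the two Counters also forces the number of blocks to agree, so the condition that both normal forms have the same number of blocks needs no separate check in this sketch.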



3.2 GI-Hardness 

In this section, we will prove that if C is not 2-affine, then GI is polynomial-time 
many-one reducible to ISOc(C) and ISO(C). As in the upper bound proofs of the 
previous section, we will often look at certain normal forms. In this section, it is 
often convenient to avoid constraint applications that allow duplicates. 

Definition 11. Let C be a set of constraints. 

1. A is a constraint application of C without duplicates if there exists a constraint 
$C \in \mathcal{C}$ of arity k such that $A = C(x_1, \ldots, x_k)$, where $x_i \neq x_j$ for all $i \neq j$. 

2. Let S be a set of constraint applications of C [without duplicates] over variables 
X. We say that S is a maximal set of constraint applications of C 
[without duplicates] over variables X if for all constraint applications A of 
C [without duplicates] over variables X, if $S \rightarrow A$, then $A \in S$. 

If X is the set of variables occurring in S, we will say that S is a maximal 
set of constraint applications of C [without duplicates]. 

The following lemma is easy to see. 

Lemma 12. Let C be a set of constraints. Let S and U be maximal sets of 
constraint applications of C over variables X [without duplicates]. Then S is 
isomorphic to U iff there exists a permutation $\pi$ of X such that $\pi(S) = U$. 

Note that if C is not 2-affine, then C is not affine, or C is affine and not 
bijunctive. We will first look at some very simple non-affine constraints. 

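Definition 13 ([9, p. 20]). 

1. $\mathrm{OR}_0$ is the constraint $\lambda xy.\, x \vee y$. 

2. $\mathrm{OR}_1$ is the constraint $\lambda xy.\, \overline{x} \vee y$. 

3. $\mathrm{OR}_2$ is the constraint $\lambda xy.\, \overline{x} \vee \overline{y}$. 

4. OneInThree is the constraint $\lambda xyz.\, (x \wedge \overline{y} \wedge \overline{z}) \vee (\overline{x} \wedge y \wedge \overline{z}) \vee (\overline{x} \wedge \overline{y} \wedge z)$. 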
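
That these four constraints are not affine can be checked mechanically. The sketch below relies on the standard characterization that a Boolean relation is affine iff its solution set is closed under coordinate-wise XOR of any three solutions. It assumes the reading OR_1 = (not x) or y and OR_2 = (not x) or (not y) for the definition above; all Python names are illustrative.

```python
from itertools import product

def solutions(f, arity):
    """All 0/1 assignments satisfying the constraint f."""
    return {t for t in product((0, 1), repeat=arity) if f(*t)}

def is_affine(f, arity):
    """Closure of the solution set under ternary coordinate-wise XOR."""
    sols = solutions(f, arity)
    return all(
        tuple(a ^ b ^ c for a, b, c in zip(s, t, u)) in sols
        for s in sols for t in sols for u in sols
    )

def or0(x, y): return x | y
def or1(x, y): return (1 - x) | y          # assumed: first argument negated
def or2(x, y): return (1 - x) | (1 - y)    # assumed: both arguments negated
def one_in_three(x, y, z): return int(x + y + z == 1)
def xor(x, y): return x ^ y                # an affine constraint, for contrast
```

For example, the three solutions 01, 10 and 11 of OR_0 XOR to 00, which is not a solution, so `is_affine(or0, 2)` is false, whereas the check succeeds for `xor`.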

As a first step in the general GI-hardness proof, we show that GI reduces 
to some particular constraints. The reduction of GI to ISO($\{\mathrm{OR}_0\}$) already 
appeared in [5]. Reductions in the other cases follow similar patterns. 

Lemma 14. 1. GI is polynomial-time many-one reducible to ISO($\{\mathrm{OR}_i\}$), $i \in \{0, 1, 2\}$. 

2. Let h be the 4-ary constraint $h(x, y, x', y') = (x \vee y) \wedge (x \oplus x') \wedge (y \oplus y')$. GI 
is polynomial-time many-one reducible to ISO($\{h\}$). 

3. Let h be a 6-ary constraint $h(x, y, z, x', y', z') = \mathrm{OneInThree}(x, y, z) \wedge (x \oplus x') \wedge (y \oplus y') \wedge (z \oplus z')$. 
Then GI is polynomial-time many-one reducible to ISO($\{h\}$). 

The constraints $\mathrm{OR}_0$, $\mathrm{OR}_1$, and $\mathrm{OR}_2$ are the simplest non-affine constraints. 
However, it is not enough to show that GI reduces to the isomorphism problem 
for these simple cases. In order to prove that GI reduces to the isomorphism 
problem for all sets of constraints that are not affine, we need to show that all 
such sets can “encode” a finite number of simple cases. 

Different encodings are used in the lower bound proofs for different constraint 
problems. All encodings used in the literature, however, allow the introduction 
of auxiliary variables. In [9], Lemma 5.30, it is shown that if C is not affine, 
then C plus constants can encode $\mathrm{OR}_0$, $\mathrm{OR}_1$, or $\mathrm{OR}_2$. This implies that, for 
certain problems, lower bounds for $\mathrm{OR}_0$, $\mathrm{OR}_1$, or $\mathrm{OR}_2$ transfer to C plus constants. 
However, their encoding uses auxiliary variables, which means that lower 
bounds for the isomorphism problem don’t automatically transfer. For sets of 
constraints that are not affine, we will be able to use part of the proof of [9], 
Lemma 5.30, but we will have to handle auxiliary variables explicitly, which 
makes the constructions much more complicated. 

Theorem 15. If C is not affine, then GI is polynomial-time many-one reducible 
to ISOc(C). 

Proof. First suppose that C is weakly negative and weakly positive. Then C 
is bijunctive [7]. From the proof of [9, Lemma 5.30] it follows that there exists 
a constraint application A(x, y, z) of C with constants such that A(0,0,0) = 
A(0,1,1) = A(1,0,1) = 1 and A(1,1,0) = 0. Since C is weakly positive, we 
also have that A(1,1,1) = 1. Since C is bijunctive, we have that A(0,0,1) = 1. 
The following truth-table summarizes all possibilities (this is a simplified version 
of [9], Claim 5.31). 



    xyz         000  001  010  011  100  101  110  111
    A(x,y,z)     1    1    a    1    b    1    0    1

Thus we obtain $A(x, x, y) = (\overline{x} \vee y)$, and the result follows from Lemma 14.1. 

So, suppose that C is not weakly negative or not weakly positive. We follow 
the proof of [9], Lemma 5.30. From the proof of [9], Lemma 5.26, it follows that 
there exists a constraint application A of C with constants such that $A(x, y) = \mathrm{OR}_0(x, y)$, 
$A(x, y) = \mathrm{OR}_2(x, y)$, or $A(x, y) = x \oplus y$. In the first two cases, the 
result follows from Lemma 14.1. 

Consider the last case. From the proof of [9], Lemma 5.30, there exist a 
set S(x, y, z, x', y', z') of C constraint applications with constants and a ternary 
function h such that $S(x, y, z, x', y', z') = h(x, y, z) \wedge (x \oplus x') \wedge (y \oplus y') \wedge (z \oplus z')$, 
h(000) = h(011) = h(101) = 1, and h(110) = 0. 

The following truth-table summarizes all possibilities: 



    xyz         000  001  010  011  100  101  110  111
    h(x,y,z)     1    a    b    1    c    1    0    d

We will first show that in most cases, there exists a set U of constraint 
applications of C with constants such that $U(x, y, x', y') = (x \vee y) \wedge (x \oplus x') \wedge (y \oplus y')$. 
In all these cases, the result follows from Lemma 14.2 above. 

- b = 0, d = 1. In this case, $S(x, y, x, x', y', x') = (x \vee \overline{y}) \wedge (x \oplus x') \wedge (y \oplus y') = (x \vee y') \wedge (x \oplus x') \wedge (y \oplus y')$. 

- b = 1, d = 0. In this case, $S(x, y, x, x', y', x') = (x' \vee y') \wedge (x \oplus x') \wedge (y \oplus y')$. 

- c = 0, d = 1. In this case, $S(x, y, y, x', y', y') = (x' \vee y) \wedge (x \oplus x') \wedge (y \oplus y')$. 

- c = 1, d = 0. In this case, $S(x, y, y, x', y', y') = (x' \vee y') \wedge (x \oplus x') \wedge (y \oplus y')$. 

- b = c = 1. In this case, $S(x, y, 0, x', y', 1) = (x' \vee y') \wedge (x \oplus x') \wedge (y \oplus y')$. 

- b = c = d = 0; a = 1. In this case, $S(0, y, z, 1, y', z') = (y' \vee z) \wedge (y \oplus y') \wedge (z \oplus z')$. 

The previous cases are analogous to the cases from the proof of [9], Claim 5.31. 
However, we have to explicitly add the $\oplus$ conjuncts to simulate the negated 
variables used there, which makes Lemma 14.2 necessary. 

The last remaining case is the case where a = b = c = d = 0. In the proof 
of [9], Claim 5.31, it suffices to note that $(\overline{y} \vee z) = \exists! x\, h(x, y, z)$. But, since we are 
looking at isomorphism, we cannot ignore auxiliary variables. Our result uses a 
different argument and follows from Lemma 14.3 above and the observation that 
$S(x, y, z, x', y', z') = \mathrm{OneInThree}(x, y, z') \wedge (x \oplus x') \wedge (y \oplus y') \wedge (z \oplus z')$. □ 
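
The case distinction on (a, b, c, d) can be verified exhaustively. The sketch below is an illustrative check, not part of the paper: it rebuilds h from the truth-table, picks the case in the order used in the proof, and compares the substituted h with the claimed formula. A primed variable is treated as the negation of its unprimed partner, which is what the $\oplus$ conjuncts enforce, so only the h part of S is compared.

```python
from itertools import product

def make_h(a, b, c, d):
    """h as given by the truth-table: h(000), h(001), ..., h(111)."""
    values = [1, a, b, 1, c, 1, 0, d]
    return lambda x, y, z: values[4 * x + 2 * y + z]

def neg(v):
    return 1 - v

def check_case(a, b, c, d):
    """True iff the formula claimed for this (a, b, c, d) matches h."""
    h = make_h(a, b, c, d)
    if (b, d) == (0, 1):
        derived, claimed, arity = (lambda x, y: h(x, y, x)), (lambda x, y: x | neg(y)), 2
    elif (b, d) == (1, 0):
        derived, claimed, arity = (lambda x, y: h(x, y, x)), (lambda x, y: neg(x) | neg(y)), 2
    elif (c, d) == (0, 1):
        derived, claimed, arity = (lambda x, y: h(x, y, y)), (lambda x, y: neg(x) | y), 2
    elif (c, d) == (1, 0):
        derived, claimed, arity = (lambda x, y: h(x, y, y)), (lambda x, y: neg(x) | neg(y)), 2
    elif b == 1 and c == 1:
        derived, claimed, arity = (lambda x, y: h(x, y, 0)), (lambda x, y: neg(x) | neg(y)), 2
    elif a == 1:  # at this point b = c = d = 0
        derived, claimed, arity = (lambda y, z: h(0, y, z)), (lambda y, z: neg(y) | z), 2
    else:         # a = b = c = d = 0: h(x, y, z) = OneInThree(x, y, z')
        derived, claimed, arity = h, (lambda x, y, z: int(x + y + neg(z) == 1)), 3
    return all(derived(*t) == claimed(*t) for t in product((0, 1), repeat=arity))

all_cases_ok = all(check_case(*t) for t in product((0, 1), repeat=4))
```

Running the check over all sixteen assignments also confirms that the listed cases, together with the final case a = b = c = d = 0, cover every possibility.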

In the case where C is affine but not 2-affine, we first show GI-hardness of 
a particular constraint and then turn to the general result. (The proofs, using 
similar constructions as in the proofs of Lemma 14 and Theorem 15, are given 
in the full paper.) 

Lemma 16. Let h be the 6-ary constraint such that $h(x, y, z, x', y', z') = (x \oplus y \oplus z) \wedge (x \oplus x') \wedge (y \oplus y') \wedge (z \oplus z')$. 
GI is polynomial-time many-one reducible 
to ISO($\{h\}$). 

Theorem 17. If C is affine and not bijunctive, then GI is polynomial-time 
many-one reducible to ISOc(C). 

Finally, to finish the proof of statement 2 of Theorem 7, it remains to show 
GI-hardness of ISO(C) for C not 2-affine. In the full paper we show that it is 
possible to remove the introduction of constants in the previous constructions 
of this section. 

Theorem 18. If C is not 2-affine, then GI $\le^p_m$ ISO(C). 

Abstract. Given a single machine and a set of jobs with due dates, 
the classical NP-hard problem of scheduling to minimize total tardiness 
is a well-understood one. Lawler gave an FPTAS for it some twenty 
years ago. If the jobs have positive weights the problem of minimizing 
total weighted tardiness seems to be considerably more intricate. To our 
knowledge there are no approximability results for it. In this paper, we 
initiate the study of approximation algorithms for the problem. We ex- 
amine first the weighted problem with a fixed number of due dates and 
we design a pseudopolynomial algorithm for it. We show how to trans- 
form the pseudopolynomial algorithm to an FPTAS for the case where 
the weights are polynomially bounded. For the general case with an ar- 
bitrary number of due dates, we provide a quasipolynomial randomized 
algorithm which produces a schedule whose expected value has an addi- 
tive error proportional to the weighted sum of the due dates. 



1 Introduction 

We study the problem of scheduling jobs on a single machine to minimize total 
weighted tardiness. We are given a set of n jobs. Job j, $1 \le j \le n$, becomes 
available at time 0, has to be processed without interruption for an integer time 
$p_j$, has a due date $d_j$, and has a positive weight $w_j$. For a given sequencing of 
the jobs the tardiness $T_j$ of job j is defined as $\max\{0, C_j - d_j\}$, where $C_j$ is the 
completion time of the job. The objective is to find a processing order of the 
jobs which minimizes $\sum_{j=1}^{n} w_j T_j$. In the 3-field notation used in scheduling the 
problem is denoted $1||\sum_j w_j T_j$. 
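
To make the objective concrete, here is a minimal sketch that evaluates $\sum_j w_j T_j$ for a given processing order and finds an optimum by brute force. It is only meant for tiny instances and is not the algorithm of this paper; the function names and the sample data are illustrative assumptions.

```python
from itertools import permutations

def total_weighted_tardiness(order, p, d, w):
    """Sum of w_j * max(0, C_j - d_j) when jobs run in the given order."""
    time = 0
    total = 0
    for j in order:
        time += p[j]                      # completion time C_j of job j
        total += w[j] * max(0, time - d[j])
    return total

def brute_force_optimum(p, d, w):
    """Best order by exhaustive search; exponential, for illustration only."""
    jobs = range(len(p))
    return min(permutations(jobs),
               key=lambda order: total_weighted_tardiness(order, p, d, w))

# Hypothetical instance with three jobs (0-indexed).
p = [3, 2, 4]   # processing times
d = [2, 4, 5]   # due dates
w = [1, 2, 1]   # weights
best = brute_force_optimum(p, d, w)
best_value = total_weighted_tardiness(best, p, d, w)
```

On this instance the order (0, 1, 2) has completion times 3, 5, 9 and cost 1·1 + 2·1 + 1·4 = 7, which is also the optimal value.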

According to Congram et al., $1||\sum_j w_j T_j$ is an “NP-hard archetypal machine 
scheduling problem” whose exact solution appears very difficult even on 
very small inputs [1]. We proceed to review what is known on the complexity 
of the problem. In the case of one machine it has long been known that an 
optimal preemptive schedule has the same total weighted tardiness as an optimal 
nonpreemptive schedule [7]. Early on the problem was shown to be NP-hard in 
the ordinary sense by Lenstra et al. [6] when the jobs have only two distinct due 
dates by a reduction from the knapsack problem. It was shown to be strongly 
NP-hard for an arbitrary number of due dates by Lawler [3]. Lawler and Moore 
[5] have presented a pseudopolynomial solution for the case when all jobs have a 
single common due date. From the algorithmic point of view we are not aware of 
any non-trivial approximation algorithm. The only case that seems to be better 
understood is the usually easier case of agreeable weights: in that case $p_j < p_i$ 
implies $w_j \ge w_i$. Lawler gave a pseudopolynomial algorithm for the agreeable-weighted 
case [3]. In 1982 he showed how to modify that algorithm to obtain an 
FPTAS for the case of unit weights [4]. Interestingly, the complexity of the unit 
weight problem, $1||\sum_j T_j$, was an open problem for many years until Du and 
Leung showed it is NP-hard [2]. 

In this paper we make progress on the problem of minimizing total weighted 
tardiness by examining first the case where the number of distinct due dates is 
fixed. Our main contribution is a pseudopolynomial algorithm whose complexity 
depends on the total processing time. This implies that the problem is in P when 
the processing times are polynomially bounded. We then show how to modify the 
pseudopolynomial algorithm in two steps: first so that its complexity depends 
on the maximum tardiness and second so that it yields an FPTAS when the 
maximum job weight is bounded by a polynomial in n. Our main approach is 
based on viewing the problem as having to pack the jobs into a finite number of 
bins where the cost of each job depends on which bin it is assigned to and some 
jobs may be split between two bins. Hopefully some of the ideas we introduce 
could be of use for further study of approximating the long-open general case 
with an arbitrary number of due dates. 

For the general case with an arbitrary number of distinct due dates we give a 
result that may be of interest when the due dates are concentrated around small 
values. Under the assumption that the maximum processing time is bounded by a 
polynomial in n, we provide a quasipolynomial randomized algorithm which produces 
a schedule whose expected value has an additive error equal to $\delta \sum_j w_j d_j$ 
for any fixed $\delta > 0$. We obtain this result by combining a partition of the time 
horizon into geometrically increasing intervals with a random shift of the due 
dates. To our knowledge this type of randomized input perturbation has not 
been used before in a scheduling context. 

2 Pseudopolynomial Algorithm 

Let us assume that the jobs have been numbered in weighted shortest processing 
time (WSPT) order, i.e., $p_1/w_1 \le p_2/w_2 \le \ldots \le p_n/w_n$. We call a job straddling 
a due date, or straddling in short, if it is the last job to start before that due 
date. A job is early in a schedule if its processing is completed by its due date, 
and a job is tardy if it completes after its due date. 
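
The WSPT numbering assumed here is a plain sort by the ratio $p_j / w_j$. A minimal sketch, assuming integer processing times and integer positive weights so that the ratios can be compared exactly; the function name is illustrative.

```python
from fractions import Fraction

def wspt_order(p, w):
    """Job indices sorted by non-decreasing p_j / w_j (ties keep input order)."""
    return sorted(range(len(p)), key=lambda j: Fraction(p[j], w[j]))

# Hypothetical data: ratios are 3, 1 and 4, so job 1 comes first.
order = wspt_order([3, 2, 4], [1, 2, 1])
```

Using `Fraction` avoids floating-point ties being broken incorrectly; renumbering the jobs along `order` gives the indexing used in the rest of the section.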

First we develop the pseudopolynomial solution for the problem with two 
fixed due dates $D_1 < D_2$. We say that job j belongs to job class m if $d_j = D_m$ 
for $m \in \{1, 2\}$. We denote these job classes by $C_m$ for $m \in \{1, 2\}$. Our approach 
could be viewed as a generalization to multiple due dates of the classical approach 
of Lawler and Moore [5] which was known to work only for the single-due-date 
case before. As it is observed in [5], there appears to be no way to identify the 
straddling jobs in an optimal schedule before finding that schedule. Therefore 
we are going to enumerate all possible pairs. Let $k_1, k_2$ be the (fixed) straddling 
jobs and let $S_{k_1}$ and $S_{k_2}$ be their starting times in a schedule. Note that if 
$S_{k_1} + p_{k_1} > D_2$, then the second straddler does not exist and we denote this case 
by $k_2 = \emptyset$. Define $P = \sum_{j=1}^{n} p_j$ and $P_i^m = \sum_{j \in \{1,\ldots,i\} \cap C_m \setminus \{k_1,k_2\}} p_j$ for m = 1, 2, and 
$1 \le i \le n$. The fixed due dates partition the total processing interval [0, P] into 
subintervals $I_1 = [0, D_1]$, $I_2 = [D_1, D_2]$ and $I_3 = [D_2, P]$. These intervals can be 
viewed as bins where we have to pack the jobs. Let $e_{i,m}$ be the total processing 
time of early jobs from $C_m \setminus \{k_1, k_2\}$ scheduled in interval $I_i$ for $m \in \{1, 2\}$ and 
$1 \le i \le m$. Similarly, let $t_{i,m}$ be the total processing time of tardy jobs from 
$C_m \setminus \{k_1, k_2\}$ scheduled in interval $I_i$ for $m \in \{1, 2\}$ and $m + 1 \le i \le 3$. Let 
$E_m = \sum_{i=1}^{m} e_{i,m}$ represent the total amount of early processing for jobs from 
$C_m \setminus \{k_1, k_2\}$. A Gantt chart showing this partition of a partial schedule for the 
first j jobs is in Fig. 1. Since any early job scheduled from job class 2 in $I_2$ 
remains early anywhere in $I_2$, any job that is scheduled to be early in $I_2$ must 
follow all jobs which are scheduled tardy in $I_2$. In other words, the jobs in the 
part $t_{2,1}$ precede the jobs in the part $e_{2,2}$. Notice that in a partial schedule for 
the first j jobs (j = 1, 2, ..., n), we always have 

\[
t_{3,1} = P_j^1 - t_{2,1} - E_1 \quad \text{and} \quad t_{3,2} = P_j^2 - E_2. \tag{1}
\]



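[Figure: Gantt chart on the time axis from 0 to P. The blocks appear in the order $e_{1,1} + e_{1,2}$, $p_{k_1}$, $t_{2,1}$, $e_{2,2}$, $p_{k_2}$, $t_{3,1} + t_{3,2}$; the marked time points are 0, $S_{k_1}$, $D_1$, $S_{k_2}$, $D_2$ and P.] 

Fig. 1. The Gantt chart for a typical partial schedule with two due dates. 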



Observe that when $k_2 = \emptyset$ then $t_{2,1} = 0$ and $e_{2,2} = 0$ must hold. Finally let 
$F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, j)$ denote the minimum total weighted tardiness of the 
job set $\{1, 2, \ldots, j\} \cup \{k_1, k_2\}$ in a schedule in which $k_1, k_2$ are the straddling jobs 
with start times $S_{k_1}$ and $S_{k_2}$, respectively, the amount of early processing from 
$C_m \setminus \{k_1, k_2\}$ is $E_m$ ($m \in \{1, 2\}$) and the total processing time of the tardy jobs 
from $C_1 \setminus \{k_1, k_2\}$ in interval $I_2$ is equal to $t_{2,1}$. 

First we make a few observations which will be useful later. 

Lemma 1. In any optimal schedule the non-straddling tardy jobs scheduled in 
any interval $I_i$ (i > 1) must appear in WSPT order. 

Proof. Let $J_i$ be the set of non-straddling tardy jobs scheduled in $I_i$ in an optimal 
schedule. Assume that the jobs in $J_i$ do not follow the WSPT order. Then there 
exist two adjacent jobs $j, l \in J_i$ such that $p_j/w_j < p_l/w_l$, but j is scheduled 
in the position immediately following l. A simple interchange argument shows 
that switching j in front of l would reduce the total weighted tardiness of the 
schedule, which contradicts its optimality. □ 
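
The interchange step can be written out explicitly. As a sketch, assume l starts at some time t and is immediately followed by j, and that both jobs are tardy before and after the swap (their due dates are no later than the start of the interval $I_i$); the symbol $\Delta$ for the cost change is introduced here only for illustration. Since no other job moves, the change in total weighted tardiness caused by the swap is

```latex
\begin{aligned}
\Delta &= \bigl[w_j (t + p_j - d_j) + w_l (t + p_j + p_l - d_l)\bigr]
        - \bigl[w_l (t + p_l - d_l) + w_j (t + p_l + p_j - d_j)\bigr] \\
       &= w_l\, p_j - w_j\, p_l .
\end{aligned}
```

This quantity is negative exactly when $p_j / w_j < p_l / w_l$, which is the assumption of the proof.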

The following result is due to McNaughton [7]; we include its easy proof here 
for the sake of completeness. 

Lemma 2. The preemptive and non-preemptive versions of any instance of the 
total weighted tardiness problem have the same minimum total weighted tardi- 
ness. 

Proof. Consider any optimal preemptive schedule. Take all but the last piece 
of a job and insert these immediately before its last piece while shifting every 
other job as much to the left as possible. The total weighted tardiness of the 
resulting schedule is clearly not worse than that of the original schedule. Repeat 
this operation for every preempted job. □ 

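Let $l(j) = \max(\{0, 1, 2, \ldots, j\} \setminus \{k_1, k_2\})$ for j = 0, 1, ..., n. We can define the 
following recursive computation for $F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, j)$, $j \neq k_1, k_2$. For 
notational convenience we use f as an abbreviation of $F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, j)$. 

If $d_j = D_1$ then 

\[
f = \min \begin{cases}
F(S_{k_1}, S_{k_2}, E_1 - p_j, E_2, t_{2,1}, l(j-1)) & \text{if } E_1 - p_j \ge 0 \\[4pt]
F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1} - p_j, l(j-1)) + w_j (S_{k_1} + p_{k_1} + t_{2,1} - D_1) & \text{if } p_j \le t_{2,1} \\[4pt]
F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, l(j-1)) + w_j (S_{k_2} + p_{k_2} + t_{3,1} + t_{3,2} - D_1) & \text{if } p_j \le t_{3,1},\ t_{3,1} + t_{3,2} \le P - (S_{k_2} + p_{k_2}) \\[4pt]
\infty & \text{otherwise}
\end{cases}
\tag{2}
\]

and if $d_j = D_2$ then 

\[
f = \min \begin{cases}
F(S_{k_1}, S_{k_2}, E_1, E_2 - p_j, t_{2,1}, l(j-1)) & \text{if } E_2 - p_j \ge 0 \\[4pt]
F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, l(j-1)) + w_j (S_{k_2} + p_{k_2} + t_{3,1} + t_{3,2} - D_2) & \text{if } p_j \le t_{3,2},\ t_{3,1} + t_{3,2} \le P - (S_{k_2} + p_{k_2}) \\[4pt]
\infty & \text{otherwise}
\end{cases}
\tag{3}
\]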

The initial conditions are: 

\[
F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, 0) = w_{k_1} \max\{0, S_{k_1} + p_{k_1} - d_{k_1}\} + w_{k_2} \max\{0, S_{k_2} + p_{k_2} - d_{k_2}\}
\]
for all $k_1$, $k_2 \neq \emptyset$, $S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}$ values (4) 

and 

\[
F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, 0) = w_{k_1} \max\{0, S_{k_1} + p_{k_1} - d_{k_1}\}
\]
for all $k_1, S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}$ values if $k_2 = \emptyset$. (5) 

To explain the above computations, consider first the case when $d_j = D_1$: 
The first line of the computation calculates the resulting function value if we 
can insert job j into the interval $I_1$ to make it early, i.e., the time length $E_1$ 
is large enough for this. Since the relative order of early jobs from the same 
job class does not affect the tardiness of the schedule, we can assume that j 
gets inserted at the end of $E_1$. The next calculation applies when job j is tardy 
and scheduled in $I_2$. Any job that is scheduled to be early in $I_2$ will follow all 
jobs which are scheduled tardy in $I_2$. Note that since the jobs are indexed in 
WSPT order, by Lemma 1 job j should be at the end of $t_{2,1}$ and thus the second 
term correctly represents its tardiness. The third calculation corresponds to the 
case when j is tardy and is scheduled in $I_3$. By Lemma 1 again, job j should 
finish at the end of the combined tardy processing time $t_{3,1} + t_{3,2}$. Recall that 
although $t_{3,1}$ and $t_{3,2}$ are not state variables, they are uniquely determined by 
the state variables and can be easily derived by equation (1). Consider now the 
case $d_j = D_2$: The first calculation deals with the case when job j is made early. 
Since the relative order of early jobs from the same job class does not affect the 
tardiness of the schedule, we can assume that j gets inserted at the end of $E_2$. 
Note that although $E_2 = e_{1,2} + e_{2,2}$ is the union of these two parts, we don’t 
need to keep track of $e_{1,2}$ and $e_{2,2}$ explicitly. We only need to ensure that $E_2$ 
is large enough to schedule job j at its end, possibly preemptively (part of it in 
$e_{2,2}$ and the remaining part in $e_{1,2}$). Thus the insertion of j into $E_2$ may actually 
mean preemptively scheduling it in $I_1 \cup I_2$, but by Lemma 2 computing the total 
weighted tardiness of a possibly preemptive schedule here is acceptable. Finally, 
the last calculation deals with the case when j can be inserted at the end of the 
combined part $t_{3,1} + t_{3,2}$ to make it tardy. 

The recursion can be implemented as a dynamic program. We have n choices 
for $k_1$, at most n − 1 choices for $k_2$, at most n − 1 choices for j, at most $p_{\max} = \max_{1 \le j \le n} p_j$ 
choices for $S_{k_1}$ and $S_{k_2}$ and no more than P choices for each of 
$E_1$, $E_2$ and $t_{2,1}$. The optimal total weighted tardiness can be obtained by finding 

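\[
\begin{aligned}
\min\{ F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, l(n)) \mid\ & k_1 \in \{1, 2, \ldots, n\},\ k_2 \in \{1, 2, \ldots, n\} \setminus \{k_1\},\\
& S_{k_1} \in \{D_1 - p_{k_1}, \ldots, D_1 - 1\},\ \max\{S_{k_1} + p_{k_1}, D_2 - p_{k_2}\} \le S_{k_2} \le \min\{S_{k_1} + p_{k_1}, D_2\},\\
& E_1 \in \{1, 2, \ldots, S_{k_1}\},\ 0 \le t_{2,1} \le S_{k_2} - (S_{k_1} + p_{k_1}),\ E_2 = S_{k_2} - t_{2,1} - p_{k_1} - E_1 \}
\end{aligned}
\]

and 

\[
\begin{aligned}
\min\{ F(S_{k_1}, S_{k_2}, E_1, E_2, t_{2,1}, l(n)) \mid\ & k_1 \in \{1, 2, \ldots, n\},\ S_{k_1} \in \{D_1 - p_{k_1}, \ldots, D_1 - 1\},\\
& S_{k_1} + p_{k_1} > D_2,\ k_2 = \emptyset,\ E_1 \in \{1, 2, \ldots, S_{k_1}\},\ t_{2,1} = 0,\ E_2 = S_{k_1} - E_1 \},
\end{aligned}
\]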

and accepting the best overall solution from the two sets. The inequalities in 
the last calculations ensure that we consider only feasible combinations of the 
state variables and we consider all of these. Thus we have proved the following 
theorem. 

Theorem 1. The recursion (2)-(5) gives a pseudopolynomial algorithm which 
computes the minimum total weighted tardiness for a problem with two distinct 
due dates in $O(n^3 p_{\max}^2 P^3)$ time. 

It can be easily seen that the above recursion can be extended to any fixed 
number of distinct due dates. The number of due dates matters only for the 
complexity of the resulting pseudopolynomial algorithm, as the running time of 
the algorithm depends exponentially on the number of distinct due dates, but it 
does not affect the logic of the argument. We state the result without including 
the details in this extended abstract. 

Theorem 2 . There is a pseudopolynomial algorithm with complexity 
which computes the minimum total weighted tardiness for a 
problem with a fixed number k of distinct due dates. 

We state next an immediate consequence of the previous two theorems, which 
sharpens the boundary between polynomially solvable and NP-hard versions of 
$1||\sum_j w_j T_j$. 

Corollary 1 . There is a polynomial algorithm which computes the minimum 
total weighted tardiness for a problem with a fixed number of distinct due dates 
if the job processing times are polynomially bounded. 

3 A Fully Polynomial Approximation Scheme 

Let $\sigma^*$ be an optimal sequence minimizing the total weighted tardiness. We use 
$T(\sigma^*)$ to denote the minimum total weighted tardiness of $\sigma^*$ and $T_{\max}(\sigma^*)$ for 
the maximum tardiness of the jobs in $\sigma^*$. We show, as an intermediate step, that 
the complexity of the pseudopolynomial algorithm of the previous section can be 
bounded by a polynomial function of $T_{\max}(\sigma^*)$. First observe that we can limit 
the dynamic programming recursion to consider only (partial or full) schedules 
whose maximum tardiness does not exceed $T_{\max}(\sigma^*)$. Accordingly, none of the 
variables $t_{2,1}$, $t_{3,1}$ or $t_{3,2}$ can be larger than $T_{\max}(\sigma^*)$. By (1) this implies that we 
need to consider at most $T_{\max}(\sigma^*) + 1$ different values for $E_1$ and $E_2$ for any fixed 
combination of the other state variables. Finally, notice that the tardiness of a 
straddling job $k_i$ will be $S_{k_i} + p_{k_i} - D_i$ for i = 1, 2, and thus $D_i - p_{k_i} \le S_{k_i}$ and 
$S_{k_i} + p_{k_i} - D_i \le T_{\max}(\sigma^*)$ imply that $S_{k_i}$ can have at most $\min\{p_{k_i}, T_{\max}(\sigma^*) + 1\}$ 
different values. In summary, the complexity of the pseudopolynomial algorithm 
of Theorem 1 can be upperbounded by $O(n^3 [T_{\max}(\sigma^*)]^5)$. 

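Similarly to [4], we are going to scale and round down the processing times 
and scale down the due dates by a constant K, which is to be determined later. 
Accordingly, let us define $\hat{d}_j = d_j / K$ and $\hat{p}_j = \lfloor p_j / K \rfloor$ for j = 1, 2, ..., n. Assume 
that we apply the pseudopolynomial algorithm of the preceding section to this 
scaled down problem and let $\sigma_A$ be the optimal sequence found by the algorithm. 
Let $\hat{T}_{\sigma_A(j)}$ be the tardiness of the jth job in this sequence with the scaled down 
data and let $T_{\sigma_A(j)}$ be the tardiness of the same job in $\sigma_A$ with the original 
data. Then we clearly have $\hat{T}_{\sigma_A(j)} \le T_{\sigma_A(j)} / K$ for j = 1, 2, ..., n. Furthermore, 
$\hat{T}_{\sigma_A} = \sum_{j=1}^{n} w_{\sigma_A(j)} \hat{T}_{\sigma_A(j)} \le T(\sigma^*) / K$ since $\sigma_A$ is optimal for the scaled down 
data. Let $T'_{\sigma_A}$ denote the total weighted tardiness of the sequence $\sigma_A$ when we 
use processing times $p'_j = K \hat{p}_j$ for each job j and the original due dates $d_j$. Note 
that $p'_j = K \hat{p}_j \le p_j < K(\hat{p}_j + 1)$. If we define $T_{\sigma_A} = \sum_{j=1}^{n} w_{\sigma_A(j)} T_{\sigma_A(j)}$, then 
we can write 

\[
T'_{\sigma_A} \le T(\sigma^*) \le T_{\sigma_A} \le \sum_{j=1}^{n} w_{\sigma_A(j)} \max\Bigl\{ \sum_{i=1}^{j} K(\hat{p}_{\sigma_A(i)} + 1) - d_{\sigma_A(j)},\ 0 \Bigr\} \le T'_{\sigma_A} + w_{\max} K n(n+1)/2, \tag{6}
\]

where $w_{\max} = \max_{1 \le j \le n} w_j$. 

Furthermore, 

\[
T'_{\sigma_A} = \sum_{j=1}^{n} w_{\sigma_A(j)} \max\Bigl\{ \sum_{i=1}^{j} K \hat{p}_{\sigma_A(i)} - d_{\sigma_A(j)},\ 0 \Bigr\}
= K \sum_{j=1}^{n} w_{\sigma_A(j)} \max\Bigl\{ \sum_{i=1}^{j} \hat{p}_{\sigma_A(i)} - \hat{d}_{\sigma_A(j)},\ 0 \Bigr\} = K \hat{T}_{\sigma_A}. \tag{7}
\]

Combining (6) and (7), we obtain 

\[
K \hat{T}_{\sigma_A} \le T(\sigma^*) \le T_{\sigma_A} \le K \hat{T}_{\sigma_A} + w_{\max} K n(n+1)/2,
\]

which implies 

\[
T_{\sigma_A} - T(\sigma^*) \le w_{\max} K n(n+1)/2. \tag{8}
\]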



Since we do not need to consider schedules for which $\hat{T}_{\sigma_A(j)}$ would exceed 
$T_{\max}(\sigma^*) / K$ for any $j \in \{1, 2, \ldots, n\}$, the complexity of the dynamic program for 
the scaled problem will be bounded by $O(n^3 [T_{\max}(\sigma^*) / K]^5)$. 

It is well known that the earliest due date (EDD) order minimizes the maximum 
tardiness with any number of due dates. Let $T_{\max}$ be the maximum tardiness 
and $T_{EDD}$ the total weighted tardiness of this schedule. We can assume 
without loss of generality that $w_j \ge 1$ for all jobs j. Then we have 

\[
T_{\max} \le T_{\max}(\sigma^*) \le T(\sigma^*) \le T_{EDD} \le n\, w_{\max} T_{\max}. \tag{9}
\]

Let us assume now that $w_{\max}$ does not grow too fast with n, i.e., there 
is a polynomial g(n) such that we have $w_{\max} \le g(n)$. If we choose 
$K = \varepsilon\, w_{\max} T_{\max} / (g^2(n) \cdot n(n+1)/2)$, then substituting into inequality (8) and using 
(9) yields 

\[
T_{\sigma_A} - T(\sigma^*) \le g(n) K n(n+1)/2 \le \varepsilon T_{\max} \le \varepsilon T_{\max}(\sigma^*) \le \varepsilon T(\sigma^*).
\]

Furthermore, the algorithm’s complexity is upperbounded by 
$O(n^3 [n\, w_{\max} T_{\max} / K]^5) \le O(n^3 [n^3 g^2(n) / \varepsilon]^5)$. Thus we have proved the 
following. 

Theorem 3. If the job weights are bounded by a polynomial in n, then there is 
a fully polynomial time approximation scheme (FPTAS) for the minimum total 
weighted tardiness problem on a single machine with two different due dates. 
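
The scaling step behind this theorem can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: it computes $T_{\max}$ from the EDD order, assumes the scaling constant has the form $K = \varepsilon\, w_{\max} T_{\max} / (g^2(n) \cdot n(n+1)/2)$, and returns the rounded processing times and scaled due dates. The names `edd_max_tardiness`, `scale_instance` and the parameter `g_n` (the value of the polynomial weight bound g(n)) are hypothetical.

```python
from fractions import Fraction
from math import floor

def edd_max_tardiness(p, d):
    """Maximum tardiness of the earliest-due-date schedule."""
    time, worst = 0, 0
    for j in sorted(range(len(p)), key=lambda j: d[j]):
        time += p[j]
        worst = max(worst, time - d[j])
    return worst

def scale_instance(p, d, w, eps, g_n):
    """Return (K, scaled processing times, scaled due dates), or None."""
    n = len(p)
    t_max = edd_max_tardiness(p, d)
    if t_max == 0:
        return None  # the EDD schedule has no tardy job and is optimal
    # Assumed form of the scaling constant K (exact rational arithmetic).
    K = Fraction(eps) * max(w) * t_max / (g_n ** 2 * Fraction(n * (n + 1), 2))
    p_hat = [floor(Fraction(pj) / K) for pj in p]   # rounded down
    d_hat = [Fraction(dj) / K for dj in d]          # scaled, not rounded
    return K, p_hat, d_hat

# Hypothetical instance: T_max = 40, w_max = 2, n(n+1)/2 = 6, g(n) = 2.
K, p_hat, d_hat = scale_instance([30, 20, 40], [20, 40, 50], [1, 2, 1],
                                 Fraction(1, 2), 2)
```

The pseudopolynomial algorithm would then be run on `p_hat` and `d_hat`; by construction each original processing time satisfies $K \hat{p}_j \le p_j < K(\hat{p}_j + 1)$.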

It is clear that the above scaling and rounding would also work for any fixed 
number of distinct due dates. Thus based on Theorem 2, we state the following 
result without proof. 

Theorem 4. If the job weights are bounded by a polynomial in n, then there is 
a fully polynomial time approximation scheme (FPTAS) for the minimum total 
weighted tardiness problem on a single machine with any fixed number of distinct 
due dates. 

4 Arbitrary Number of Due Dates 

In this section we examine a general instance of the problem with an arbitrary 
number of due dates. Our goal is to transform the given instance into one with a 
reduced, although not necessarily constant, number of due dates. We then apply 
our previous algorithm whose complexity depends exponentially on the number 
of distinct due dates. 

We are given an instance I with due dates $d_j$, j = 1, ..., n and we will 
produce an instance I' with due dates $d'_j$, j = 1, ..., n. For a given schedule 
S let cost(S) denote the total tardiness under the $d_j$ and cost'(S) the total 
tardiness under the $d'_j$. Similarly use $T_j$, $T'_j$ to denote the tardiness of job j 
in each case with reference to the same schedule S. Let the original optimum 
OPT refer to the optimal tardiness of instance I under the $d_j$ and the modified 
optimum OPT' to the optimal tardiness of I' under the $d'_j$. 

What is a good way to generate the $d'_j$? Assume for example that we adopt the 
following strategy: for every job j, enforce $d'_j \le d_j$. Then for a fixed schedule 
S, we have $T'_j \ge T_j$ for all j, and hence $cost'(S) \ge cost(S)$. Computing S as 
a near-optimal schedule for the $d'_j$ forces us to shoot for a modified optimum 
$OPT' \ge OPT$. When we calculate the cost of S under the original $d_j$ it will 
potentially decrease. In order to analyze the performance guarantee we have 
to deal with two opposing effects: (i) upperbound the increase of OPT' with 
respect to OPT and (ii) lowerbound the difference cost'(S) − cost(S). Symmetric 
considerations apply if we choose to set $d'_j \ge d_j$ for every j. 

A mixed strategy where for some jobs the due dates increase and for others 
the due dates decrease seems to be more flexible. To counter the opposing effects 
inherent in the analysis, we choose to use randomization: for every job j we 
determine $a_j, b_j$ such that $d_j \in [a_j, b_j]$, and we set $d'_j$ to $a_j$ with some probability 
$\lambda_j$ and to $b_j$ with probability $1 - \lambda_j$. We proceed with the analysis and will 
determine later suitable $a_j, b_j, \lambda_j$ values for each job. We emphasize again that 
for time efficiency the resulting number of distinct due dates and hence the 
number of distinct $a_j, b_j$ values must be small. 
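
The randomized transformation described above amounts to one independent coin flip per job. A minimal sketch, assuming the lists `a`, `b` and `lam` (holding the $a_j$, $b_j$ and the probabilities of rounding down to $a_j$) are already chosen; the function name and the explicit random generator argument are illustrative.

```python
import random

def perturb_due_dates(a, b, lam, rng):
    """Set d'_j = a_j with probability lam_j and d'_j = b_j otherwise."""
    return [aj if rng.random() < lj else bj
            for aj, bj, lj in zip(a, b, lam)]

# Hypothetical data; a seeded generator keeps the run reproducible.
rng = random.Random(0)
modified = perturb_due_dates([4, 8], [8, 16], [0.5, 0.25], rng)
```

Passing the generator explicitly makes the perturbation repeatable, which is convenient when the expected cost is estimated by averaging over several runs.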

Lemma 3. Under the instance transformation defined above, 

\[
E[OPT'] \le OPT + \sum_{j} w_j \lambda_j (d_j - a_j).
\]

Proof. Consider the schedule $S_0$ which achieves the original optimum OPT under 
the $d_j$. We establish the relation of $cost'(S_0)$ to OPT. If job j is tardy in $S_0$ 
with respect to $d_j$, the expectation of $T'_j - T_j$ is at most $\lambda_j (d_j - a_j)$. If j is early 
with respect to $d_j$, $E[T'_j] \le \lambda_j (d_j - a_j)$. The lemma follows. □ 

We now examine the effect of calculating the expected cost of a schedule 
under the original due dates dj. 

Lemma 4. Consider any schedule S. Define $B_j = -\lambda_j (d_j - a_j) + (1 - \lambda_j)(b_j - d_j)$. 
If $B_j \ge 0$, j = 1, ..., n, then 

\[
E[cost(S)] \le E[cost'(S)] + \sum_{j} w_j B_j.
\]

Proof. Let $C_j$ be the completion time of job j in schedule S. If $C_j \le d_j$, the 
tardiness $T_j$ is zero. If $C_j > d_j$ we estimate $E[T_j - T'_j]$. Case 1: $d'_j = a_j$. This 
event happens with probability $\lambda_j$ and the tardiness under $d_j$ decreases with 
respect to $T'_j$ by $(d_j - a_j)$. Case 2: $d'_j = b_j$. This event happens with probability 
$(1 - \lambda_j)$ and the tardiness increases with respect to $T'_j$ by at most $(b_j - d_j)$. It 
follows that for a fixed schedule S, we have $E[T_j - T'_j] \le B_j$. Because we assume 
that $B_j$ is nonnegative for all j, including the jobs for which $T_j = 0$, the lemma 
follows. □ 

Observe that in the upcoming theorem we consider for added generality
the existence of a non-standard approximation scheme that finds a
$(1 + \varepsilon)$-approximation for $\varepsilon \ge 0$, i.e., we also consider the existence of an
exact algorithm.
Theorem 5. Let $I'$ be an instance derived from $I$ based on the transformation
defined above and let $A$ be an approximation scheme for total weighted tardiness
with running time $T(A, I', \varepsilon)$ on instance $I'$ for any $\varepsilon \ge 0$. If $B_j \ge 0$,
$j = 1, \dots, n$, we can compute in time $T(A, I', \varepsilon)$ a schedule $S$ such that

$$E[cost(S)] \le (1 + \varepsilon) OPT + \sum_j w_j \bigl(\varepsilon \lambda_j (d_j - a_j) + (1 - \lambda_j)(b_j - d_j)\bigr).$$

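Proof. By Lemma 3, $E[OPT'] \le OPT + \sum_j w_j \lambda_j (d_j - a_j)$. Invoking algorithm
$A$ on $I'$ yields a schedule $S$ with cost $cost'(S) \le (1 + \varepsilon) OPT'$, which implies

$$E[cost'(S)] \le E[(1 + \varepsilon) OPT'] \le (1 + \varepsilon) OPT + (1 + \varepsilon) \sum_j w_j \lambda_j (d_j - a_j).$$

Mapping back the due dates to the original $d_j$ values yields by Lemma 4

$$E[cost(S)] \le E[(1 + \varepsilon) OPT'] + \sum_j w_j B_j.$$

Substituting the upper bound on the expectation of $OPT'$ yields

$$E[cost(S)] \le (1 + \varepsilon)\Bigl(OPT + \sum_j w_j \lambda_j (d_j - a_j)\Bigr) + \sum_j w_j B_j$$

and, expanding $B_j$,

$$E[cost(S)] \le (1 + \varepsilon) OPT + \sum_j w_j \bigl(\varepsilon \lambda_j (d_j - a_j) + (1 - \lambda_j)(b_j - d_j)\bigr).$$ □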



We demonstrate now a way to define the $a_j$'s and the $b_j$'s. We follow the
method of partitioning the time horizon from 0 to $\sum_j p_j$ in geometrically
increasing intervals whose endpoints are powers of $1 + \delta$ for fixed $\delta > 0$. Any due
date that falls on a power of $1 + \delta$ or at the endpoints of the time horizon is
left unchanged. Otherwise if $d_j \in ((1 + \delta)^l, (1 + \delta)^{l+1})$ define $a_j = (1 + \delta)^l$,
$b_j = (1 + \delta)^{l+1}$ and denote $l$ by $l_j$. Observe that for many different $j$, the $l_j$
values may coincide. Let $L$ denote the number of distinct due dates after this
transformation. Under the assumption that the processing times are bounded by
a polynomial in $n$, we can apply the algorithm described in Theorem 2 on the
transformed instance $I'$. The running time of the algorithm depends on $L$. In
our case $L = \lceil \log_{1+\delta} \sum_j p_j \rceil + 2$; therefore we obtain that
$L = O(\log n / \log(1 + \delta))$ under our assumption, i.e., the algorithm will be
quasipolynomial. Hence we have the following theorem.

Theorem 6. If the job processing times are bounded by a polynomial in $n$, then
for any fixed $\delta > 0$, we can compute in quasipolynomial randomized time a
schedule $S$ such that

$$E[cost(S)] \le OPT + \delta \sum_j w_j d_j.$$
Proof. Consider the instance $I'$ produced from the original instance by the above
transformation. We will show at the end that for the chosen $a_j, b_j$ values, we can
choose $\lambda_j$'s so that $B_j \ge 0$, $j = 1, \dots, n$. Under this assumption and by using
Theorem 2, Theorem 5 applies with $\varepsilon = 0$ and one can compute a schedule $S$
such that

$$E[cost(S)] \le OPT + \sum_j w_j (1 - \lambda_j)\bigl((1 + \delta)^{l_j + 1} - d_j\bigr).$$

We now upper-bound the additive error term.

$$\sum_j w_j (1 - \lambda_j)\bigl((1 + \delta)^{l_j + 1} - d_j\bigr) \le \sum_j w_j (1 - \lambda_j)(1 + \delta)^{l_j}(1 + \delta - 1) \le \delta \sum_j w_j (1 + \delta)^{l_j} \le \delta \sum_j w_j d_j.$$

The above derivation went through without imposing any constraint on $\lambda_j$.
Ensuring that $B_j = -\lambda_j (d_j - a_j) + (1 - \lambda_j)(b_j - d_j) \ge 0$ is equivalent to
$\lambda_j \le (b_j - d_j)/(b_j - a_j)$. Since for all $j$, $0 < a_j < d_j < b_j$, it is always possible to
choose $\lambda_j$ to meet this constraint. □
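The rounding of a single due date can be sketched in Python as follows. This is only an illustration of the transformation described above, not code from the paper: the function names are hypothetical, and $\lambda_j$ is set to its largest admissible value $(b_j - d_j)/(b_j - a_j)$, which is one possible choice and makes $B_j = 0$.

```python
import math
import random

def interval_bounds(d, delta):
    """Return (a, b): consecutive powers of (1 + delta) with a <= d < b, for d > 0."""
    l = math.floor(math.log(d) / math.log(1.0 + delta))
    return (1.0 + delta) ** l, (1.0 + delta) ** (l + 1)

def lam_max(d, a, b):
    """Largest lambda_j with B_j >= 0 (an assumed choice; any smaller value also works)."""
    return (b - d) / (b - a)

def slack_B(d, a, b, lam):
    """B_j = -lam * (d - a) + (1 - lam) * (b - d), as defined in Lemma 4."""
    return -lam * (d - a) + (1.0 - lam) * (b - d)

def round_due_date(d, delta, rng=random):
    """Sample the new due date: a_j with probability lam, b_j otherwise."""
    a, b = interval_bounds(d, delta)
    if math.isclose(a, d):          # d already falls on a power of (1 + delta)
        return d
    return a if rng.random() < lam_max(d, a, b) else b
```

For example, with $d_j = 10$ and $\delta = 0.5$ the bracketing powers are $1.5^5$ and $1.5^6$, and the additive term $b_j - d_j$ stays below $\delta d_j$.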
Abstract. We study an online scheduling problem for unit-length jobs,
where each job is specified by its release time, deadline, and a nonnegative
weight. The goal is to maximize the weighted throughput, that is the
total weight of scheduled jobs. We first give a randomized algorithm
RMix with competitive ratio of $e/(e - 1) \approx 1.582$. Then we consider
s-bounded instances where the span of each job is at most $s$. We give
a 1.25-competitive randomized algorithm for 2-bounded instances, and
a deterministic algorithm EdF$_\alpha$, whose competitive ratio on s-bounded
instances is at most $2 - 2/s + o(1/s)$. For 3-bounded instances its ratio
is $\phi \approx 1.618$, matching the lower bound.

We also consider 2-uniform instances, where the span of each job is 2.
We prove lower bounds for randomized algorithms and deterministic
memoryless algorithms. Finally, we consider the multiprocessor case and
give a $1/(1 - (\frac{M}{M+1})^M)$-competitive algorithm for $M$ processors. We
also show improved lower bounds for the general and 2-uniform cases.



1 Introduction 

Network protocols today offer only the ‘best-effort service’, the term — misnomer, 
in fact — that describes the most basic level of service that does not involve firm 
guarantees for packet delivery. Next-generation networks, however, will provide 
support for differentiated services, to meet various quality-of-service (QoS) de- 
mands from the users. In this paper we consider an online buffer management 
problem that arises in such QoS applications. 

In the bounded delay buffer problem [8,1], packets arrive and are buffered at 
network switches. At each integer time step, one packet is sent along the link. 
Each packet is characterized by its QoS value, which can be thought of as a 
benefit gained by forwarding the packet. Network switches can use this value 
to prioritize the packets. Each packet has a deadline that indicates the latest 
time when a packet can be sent. In overload conditions, some packets will not be 
sent by their deadline, do not contribute to the benefit value, and can as well be
dropped. The objective is to maximize the total value of the forwarded packets. 

It is easy to see that this buffer management problem is equivalent to the 
following unit-job scheduling problem. We are given a set of $n$ unit-length jobs,
with each job $j$ specified by a triple $(r_j, d_j, w_j)$ where $r_j$ and $d_j$ are integral
release times and deadlines, and $w_j$ is a non-negative real weight. One job can
be processed at each integer time. The goal is to compute a schedule for the 
given set of jobs that maximizes the weighted throughput or gain, that is the 
total weight of the jobs completed by their deadline. 
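For small instances the optimal offline gain can be computed directly. The sketch below is an illustration under stated assumptions, not part of the paper: jobs are tuples `(r, d, w)`, a job may run in any integer slot `t` with `r <= t < d`, and the function names are hypothetical. It adds jobs heaviest first and keeps a job whenever the chosen set can still be scheduled, which is checked with an earliest-deadline-first pass; this greedy is optimal because the schedulable job sets form a matroid.

```python
import heapq

def feasible(jobs):
    """EDF check: can every job (r, d, w) get its own integer slot t with r <= t < d?"""
    jobs = sorted(jobs)                       # by release time
    heap, i, t, n = [], 0, 0, len(jobs)
    while i < n or heap:
        if not heap:
            t = max(t, jobs[i][0])            # idle until the next release
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, jobs[i][1])  # track deadlines of released jobs
            i += 1
        if heapq.heappop(heap) <= t:          # earliest deadline already missed
            return False
        t += 1
    return True

def offline_opt(jobs):
    """Return (optimal gain, chosen jobs) by heaviest-first greedy with a feasibility test."""
    chosen = []
    for job in sorted(jobs, key=lambda j: -j[2]):
        if feasible(chosen + [job]):
            chosen.append(job)
    return sum(w for _, _, w in chosen), chosen
```

On the instance `[(0, 1, 1.0), (0, 2, 2.0), (1, 2, 3.0)]` only two slots are usable, so the sketch keeps the two heaviest jobs.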

In this paper we focus on the online version of this problem, where each 
job arrives at its release time. At each time step, an online algorithm needs to 
schedule one of the pending jobs, without the knowledge of the jobs released 
later in the future. An online algorithm $A$ is called $R$-competitive, if its gain on
any instance is at least $1/R$ times the optimum (offline) gain on this instance.
The smallest such value R is called the competitive ratio of A. The competitive 
ratio is commonly used as a performance measure for online algorithms, and we 
adopt this measure in this paper. 

For unit jobs, some restrictions on instances have been proposed in the liter- 
ature [8,1,5]. In s-bounded instances, the span of the jobs (defined as the differ- 
ence between the deadline and the release time) is at most s, and in s-uniform 
instances the span of each job is exactly s. In the context of QoS buffer manage- 
ment, these cases correspond to QoS situations in which the end-to-end delay is 
critical and only a small amount of delay is allowed at each node [8] . 

The unit-job scheduling problem is related to another scheduling problem 
which also arises from QoS applications. In the metered-task model [2,6], each job is
specified by four real numbers: release time, deadline, processing time (not neces- 
sarily unit), and weight. Preemptions are allowed. Unlike in classical scheduling, 
even non-completed jobs contribute to the overall gain. Specifically, the gain of 
a job is proportional to the amount of it that was processed. 

Past work. A naive greedy algorithm that always schedules the heaviest job is 
known to be 2-competitive [8,7]. No better algorithm, deterministic or random- 
ized, is known for the general case. For the deterministic case, a lower bound of 
$\phi \approx 1.618$ was shown in [1,5,7]. In the randomized case, [5] gives a lower bound
of 1.25. (The proof in [5] was for metered tasks, but it carries over to unit jobs.) 
Both of those lower bounds apply even to 2-bounded instances. 

For the 2-bounded case, a $\phi$-competitive algorithm was presented in [8].
Deterministic algorithms for 2-uniform instances were studied by [1], who
established a lower bound of $\frac{1}{2}(\sqrt{3} + 1) \approx 1.366$ and an upper bound of $\sqrt{2} \approx 1.414$.

In [8] , a version of the buffer management problem was studied in which the 
output port has bandwidth M (that is, M packets at a time can be sent). This 
corresponds to the problem of scheduling unit-time jobs on M processors. In [8] 
a lower bound of $4 - 2\sqrt{2} \approx 1.172$, for any $M$, was presented that applies even to
the 2-bounded model. For the 2-uniform case, a lower bound of 10/9 was given. 

Our results. First, we give a randomized algorithm with competitive ratio
$e/(e - 1) \approx 1.582$, which is the first algorithm for this problem with competitive ratio
below 2. Our algorithm has been inspired by the techniques developed in [6]. 
For 2-bounded instances, we give a 1.25-competitive randomized algorithm,
matching the known lower bound from [5]. 

We also give a deterministic algorithm EdF$_\alpha$, whose competitive ratio on
3-bounded instances is $\phi \approx 1.618$, matching the lower bound. This result extends
previous results from the literature for 2-bounded instances [8], and it provides
evidence that a $\phi$-competitive deterministic algorithm might be possible for the
general case. For 4-bounded instances, EdF$_\alpha$ is $\sqrt{3} \approx 1.732$ competitive, and for
s-bounded instances it is $2 - 2/s + o(1/s)$ competitive. However, without the
restriction on the span, it is only 2-competitive.

For 2-uniform instances, we prove a lower bound of $4 - 2\sqrt{2} \approx 1.172$ for
randomized algorithms, improving the 10/9 bound from [8]. In the deterministic
case, we prove a lower bound of $\sqrt{2} \approx 1.414$ on algorithms that are memoryless
(we say that an algorithm is memoryless if its decision at each step depends only 
on the pending jobs and is invariant under multiplying all weights of pending 
jobs by a constant). This matches the previously known upper bound in [1]. We 
remark that all competitive algorithms for unit-job scheduling in the literature 
are memoryless. 

Finally, we study the M-processor case, which corresponds to the buffer
management problem in which the output port has bandwidth $M$, meaning that it
can send $M$ packets at a time. We give a $1/(1 - (\frac{M}{M+1})^M)$-competitive algorithm
for the case of $M$ processors. For randomized algorithms, we also show improved
lower bounds of 1.25 for the general and $4 - 2\sqrt{2} \approx 1.172$ for the 2-uniform cases.

In addition to those results, we introduce a new algorithm called Bal$_\beta$,
where $\beta$ is a parameter, and we analyze it in several cases. On 2-uniform
instances, Bal$_{\sqrt{2}/2}$ is $\sqrt{2}$-competitive, matching the bound from [1]. On
2-bounded instances, Bal$_\beta$ is $\phi$-competitive (and thus optimal) for two values
of $\beta \in \{2 - \phi, \phi - 1\}$. It is also $\phi$-competitive for 3-bounded instances. Although
we can show that Bal$_\beta$ cannot be $\phi$-competitive in general, we conjecture that
for some values of $\beta$ its ratio is better than 2.

Our results show the power of randomization for the problem of scheduling 
unit jobs. For the general version, our randomized algorithm outperforms all 
deterministic algorithms, even on the special case of span at most 2. For span at 
most 2, we give a tight analysis of the randomized case, showing a surprisingly 
low competitive ratio of 1.25, compared to 1.618 in the deterministic case. 

2 Preliminaries 

As we noted in the introduction, the QoS buffer management problem is
equivalent to the unit-job scheduling problem. We will henceforth use scheduling
terminology in this paper. We number the jobs $1, 2, \dots, n$. Each job $j$ is specified by
a triple $(r_j, d_j, w_j)$, where $r_j$ and $d_j$ are integral release times and deadlines, and
$w_j$ is a non-negative real weight. To simplify terminology and notation, we will
often use the weights of jobs to identify jobs. Thus, we will say “job $w$” meaning
“the job with weight $w$”. A schedule $S$ specifies which jobs are executed, and
for each executed job $j$ it specifies an integral time $t$ when it is scheduled, where
$r_j \le t < d_j$. Only one job can be scheduled at any given time step. The throughput
or gain of a schedule $S$ on instance $I$, denoted $gain_S(I)$, is the total weight
of the jobs in $I$ that are executed in $S$. Similarly, if $A$ is a scheduling algorithm,
$gain_A(I)$ is the gain of the schedule computed by $A$ on $I$. The optimal gain on
$I$ is denoted by $opt(I)$. We say that an instance is s-bounded if $d_j - r_j \le s$ for
all jobs $j$. Similarly, an instance is s-uniform if $d_j - r_j = s$ for all jobs $j$. The
difference $d_j - r_j$ is called the span of a job $j$. A job $i$ is pending in schedule $S$
at $t$ if $r_i \le t < d_i$ and $i$ has not been scheduled before $t$.
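These definitions translate into a few one-line helpers. The sketch below is illustrative only; the `Job` type and the function names are hypothetical, and jobs are assumed to be distinguishable as `(r, d, w)` triples (two identical triples would be treated as the same job).

```python
from typing import NamedTuple

class Job(NamedTuple):
    r: int      # release time
    d: int      # deadline
    w: float    # weight

def span(job):
    """Span of a job: deadline minus release time."""
    return job.d - job.r

def is_s_bounded(jobs, s):
    """True if every job has span at most s."""
    return all(span(j) <= s for j in jobs)

def is_s_uniform(jobs, s):
    """True if every job has span exactly s."""
    return all(span(j) == s for j in jobs)

def pending(jobs, scheduled, t):
    """Jobs released by time t, not yet expired, and not in the set `scheduled`."""
    return [j for j in jobs if j.r <= t < j.d and j not in scheduled]
```

For instance, a job with `r = 1` and `d = 3` has span 2 and is pending at times 1 and 2 unless it has already been scheduled.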

We often consider offline (canonical) earliest-deadline schedules. In such
schedules, the job that is scheduled at any time t is chosen (from the ones that 
are executed in the schedule) as the pending job with the earliest deadline. Any 
schedule can easily be converted into an earliest-deadline schedule by rearranging 
its jobs. Jobs with the same deadline are ordered by decreasing weights. (Jobs 
with equal weights are ordered arbitrarily, but consistently by all algorithms.) 

We often view the behavior of an online algorithm $A$ as a game between
$A$ and an adversary. Both algorithms schedule jobs released by the adversary
who tries to maximize the ratio $opt(I)/gain_A(I)$. In most of the proofs we give
a potential function argument by defining a potential function $\Phi$ that maps all
possible configurations into real numbers. At each time step, an online algorithm
and the adversary execute a job. The proofs are based on the following lemma.

Lemma 1. Let $A$ be an online algorithm. Let $\Phi$ be a potential function that is 0
on configurations with no pending jobs, and at each step satisfies
$R \cdot \Delta gain_A \ge \Delta adv + \Delta\Phi$, where $\Delta\Phi$ represents the change of the potential, and
$\Delta gain_A$, $\Delta adv$ represent $A$'s and the adversary's gain in this step. Then $A$ is
$R$-competitive.
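The lemma follows by summing the per-step inequality over all steps. A short sketch of that telescoping argument, assuming that the adversary follows an optimal schedule and that the initial and final configurations have no pending jobs (so the potential is 0 at both ends):

```latex
\begin{align*}
R \cdot gain_A(I)
  = \sum_{\text{steps}} R \cdot \Delta gain_A
  &\ge \sum_{\text{steps}} \bigl( \Delta adv + \Delta\Phi \bigr) \\
  &= opt(I) + \Phi_{\text{final}} - \Phi_{\text{initial}}
  = opt(I).
\end{align*}
```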

The lemma above applies to randomized algorithms as well. In that case,
however, $\Delta gain_A$ and $\Delta\Phi$ are the expected values of the corresponding quantities,
with respect to the algorithm’s random choices at the given step.

In some proofs we use a different approach called charging. In a charging 
scheme, the weight of each of the jobs in the adversary schedule is charged to a 
job, or several jobs, in our schedule, in such a way that each job in our schedule 
is charged at most R times its weight. If such a charging scheme exists, it implies 
that our algorithm is $R$-competitive.

As discussed in the introduction, our problem is related to the metered-task 
model. Consider the discrete metered-task model, in which jobs have integral re- 
lease times, deadlines and processing lengths, and the algorithm can only switch 
jobs at integral times. (In [5] this model is called non-timesharing.) Then: 

Theorem 1. The unit-job scheduling problem with a single processor is equiva- 
lent to the single processor discrete metered-task model. The unit-job scheduling 
problem with M processors is a special case of the M-processor discrete
metered-task model (assuming jobs can migrate from one machine to another); they are
equivalent when, in addition, all jobs in the metered-task model are of unit length. 

The continuous version of the metered-task model [2,6,5] bears some resem- 
blance to the randomized case of unit-job scheduling, although it is not clear 
whether the results from the former model can be automatically translated into 
results for the latter model. One may attempt to convert a deterministic
algorithm $\mathcal{D}$ for metered tasks into a randomized algorithm $\mathcal{R}$ for unit jobs, by
setting the probability of $\mathcal{R}$ executing a given job $j$ to be equal to $\mathcal{D}$'s fraction
of the processor power devoted to $j$. It is, however, not clear how to extend this
into a full specification of an algorithm that would match the performance of $\mathcal{D}$.



3 Randomized Algorithm RMix 

In this section we give the first randomized algorithm for scheduling unit jobs 
with competitive ratio smaller than 2. 



Algorithm RMix. At each step, let $h_1$ be the heaviest pending job. Select a real
number $x \in (0, 1)$ uniformly at random. Schedule the job $j$ with earliest deadline
among the pending jobs with $w_j \ge e^{-x} w_{h_1}$.



Notation. At each step we select inductively a sequence of pending jobs
$h_2, \dots, h_k$ so that $h_{i+1}$ is the heaviest job $j$ such that $w_j > w_{h_1}/e$ and $d_j < d_{h_i}$;
if such $j$ does not exist, we set $k = i$. In case of ties, prefer jobs with earlier
deadlines. Denote $v_i = w_{h_i}$ for $i = 1, \dots, k$ and $v_{k+1} = w_{h_1}/e$. Let
$\delta_i = \ln(v_i) - \ln(v_{i+1})$. Note that RMix schedules the job $h_i$ with probability
$\delta_i$ and $\sum_{i=1}^{k} \delta_i = 1$. At a given time step, the expected gain of RMix is
$\sum_{i=1}^{k} \delta_i v_i$.
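One step of RMix and the chain $h_1, \dots, h_k$ can be sketched as follows. This is an illustration rather than the authors' code: pending jobs are assumed to be `(deadline, weight)` pairs with a positive heaviest weight, and all function names are hypothetical.

```python
import math
import random

def rmix_chain(pending):
    """Return [h_1, ..., h_k]: each next job is the heaviest one with weight
    above w_{h_1}/e and a strictly earlier deadline than the previous job."""
    heaviest_first = lambda j: (j[1], -j[0])      # ties: prefer earlier deadline
    chain = [max(pending, key=heaviest_first)]
    floor_w = chain[0][1] / math.e
    while True:
        cands = [j for j in pending if j[1] > floor_w and j[0] < chain[-1][0]]
        if not cands:
            return chain
        chain.append(max(cands, key=heaviest_first))

def rmix_probabilities(chain):
    """delta_i = ln(v_i) - ln(v_{i+1}) with v_{k+1} = v_1 / e."""
    v = [j[1] for j in chain] + [chain[0][1] / math.e]
    return [math.log(v[i]) - math.log(v[i + 1]) for i in range(len(chain))]

def rmix_step(pending, rng=random):
    """Pick the earliest-deadline job among those with weight >= e^{-x} * w_{h_1}."""
    x = rng.random()
    threshold = math.exp(-x) * max(w for _, w in pending)
    eligible = [j for j in pending if j[1] >= threshold]
    return min(eligible, key=lambda j: (j[0], -j[1]))
```

With pending jobs `[(5, 10.0), (3, 6.0), (2, 5.0), (1, 1.0)]` the chain consists of the first three jobs, and the probabilities $\delta_i$ sum to 1 by telescoping.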



Theorem 2. Algorithm RMix is $\frac{e}{e-1}$-competitive.



Proof. At a given time step, let $X$ be the set of pending jobs in RMix, and let
$Y$ be the set of pending jobs in the adversary schedule that he will schedule in
the future. We assume that the adversary schedule is canonical earliest-deadline.

Define the potential $\Phi = \frac{1}{e-1} \sum_{i \in Y - X} w_i$. Job arrivals and expirations cannot
increase the potential as these jobs are not in $Y - X$: the arriving job is always
in $X$ and the expiring job is never in $Y$ by the definition of $Y$. So we only need to
analyze how the potential changes after job execution. By Lemma 1, denoting by
$j$ the job scheduled by the adversary, it is sufficient to prove that
$w_j + \Delta\Phi \le \frac{e}{e-1} \sum_{i=1}^{k} \delta_i v_i$.

Assume that $j \in Y \cap X$. The inequality $\ln x \le x - 1$ for $x = v_{i+1}/v_i$ implies that,
for any $i \le k$,

$$v_i - v_{i+1} \le v_i (\ln v_i - \ln v_{i+1}) = \delta_i v_i. \qquad (1)$$

We have $w_j \le v_1$ as $j \in X$. Let $p \in \{1, \dots, k+1\}$ be the largest index such that
$w_j \le v_p$. By the assumption that the adversary schedule is earliest-deadline, we
know that he will not execute any $h_i$, $i = p, \dots, k$, in the future, so these are not
in $Y$. The expected increase of $\Phi$ is then at most $\frac{1}{e-1} \sum_{i=1}^{p-1} \delta_i v_i$. Using
$v_{k+1} = v_1/e$ and (1), we have

$$w_j + \Delta\Phi \le v_p + \frac{1}{e-1} \sum_{i=1}^{p-1} \delta_i v_i = (v_p - v_{k+1}) + \frac{1}{e-1}(v_1 - v_{k+1}) + \frac{1}{e-1} \sum_{i=1}^{p-1} \delta_i v_i$$

$$= \sum_{i=p}^{k} (v_i - v_{i+1}) + \frac{1}{e-1} \sum_{i=1}^{k} (v_i - v_{i+1}) + \frac{1}{e-1} \sum_{i=1}^{p-1} \delta_i v_i \le \sum_{i=p}^{k} \delta_i v_i + \frac{1}{e-1} \sum_{i=1}^{k} \delta_i v_i + \sum_{i=1}^{p-1} \delta_i v_i = \frac{e}{e-1} \sum_{i=1}^{k} \delta_i v_i.$$

The easy case when $j \in Y - X$ is omitted.
4 An Optimal Randomized Algorithm for 2-Bounded Instances

In this section we give a 1.25-competitive randomized algorithm for 2-bounded 
instances. This matches the lower bound from [5], and thus completely resolves 
this case. 

Algorithm R2b. Define $p_{ab} = 1$ if $a \ge b$ and $p_{ab} = \frac{4a}{5b}$ otherwise. Let $q_{ab} = 1 - p_{ab}$.
Let $a$ and $b$ denote the heaviest jobs of span 1 and span 2, respectively, released
at this time step. If the currently pending job is $x$, let $u = \max(x, a)$. Execute $u$
with probability $p_{ub}$ and $b$ with probability $q_{ub}$.

Theorem 3. Algorithm R2b is 1.25-competitive.

Proof. Without loss of generality, we can assume that at each step exactly one 
job of span 1 is issued. All jobs of span 1 except the heaviest one can be simply 
ignored; if no job is issued, we treat it as a job of weight 0. Similarly, we can 
assume that at each step (except last) exactly one job of span 2 is issued. This 
can be justified as follows: If, at a given time t, the optimum schedule contains 
a job of span two released at t, we can assume that it is the heaviest such job. 
A similar statement holds for Algorithm R2b. Thus all the other jobs of span 2 
can be ignored in this step, and treated as if they are issued with span 1 in the 
following time step. 

First note that $p_{ab}$ satisfies the following properties for any $a, b \ge 0$.

$$5 p_{ab} a \ge 4a - b \qquad (2)$$
$$5 (p_{ab} a + q_{ab} b) \ge 4b \qquad (3)$$
$$5 p_{ab} a + 2 q_{ab} b \ge 4a \qquad (4)$$
$$5 p_{ab} a + 2 q_{ab} b \ge b \qquad (5)$$
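Properties (2) to (5) can be checked numerically. The sketch below assumes that the probability reads $p_{ab} = 4a/(5b)$ for $a < b$ (and 1 otherwise); the function names are hypothetical, jobs are identified with their weights as in the text, and a small tolerance absorbs floating-point rounding in the cases where a property holds with equality.

```python
import random

def p(a, b):
    """Probability of executing the job of weight a rather than b (assumed form)."""
    return 1.0 if a >= b else 4.0 * a / (5.0 * b)

def q(a, b):
    return 1.0 - p(a, b)

def properties_hold(a, b, eps=1e-9):
    """Check inequalities (2)-(5) for non-negative weights a and b."""
    pa, qb = p(a, b), q(a, b)
    return (5 * pa * a >= 4 * a - b - eps               # (2)
            and 5 * (pa * a + qb * b) >= 4 * b - eps    # (3)
            and 5 * pa * a + 2 * qb * b >= 4 * a - eps  # (4)
            and 5 * pa * a + 2 * qb * b >= b - eps)     # (5)

def r2b_step(x, a, b, rng=random):
    """One step of R2b: run u = max(x, a) with probability p_ub, else run b."""
    u = max(x, a)
    return u if rng.random() < p(u, b) else b
```

Property (3) is tight at $b = 2a$, which is where the ratio 1.25 is attained in Case 1 of the analysis.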

Algorithm R2b is memoryless and randomized, so its state at each step is
given by a pair $(x, s)$, where $x$ is the job of span 2 issued in the previous step,
and $s$ is the probability that $x$ was executed in the previous step (i.e., no job is
pending). Denote by $t = 1 - s$ the probability that $x$ is pending.

Denoting by $z \in \{0, x\}$ the pending job of the adversary, the complete
configuration at this step is described by a triple $(x, s, z)$. Let $\Phi_{xsz}$ denote
the potential function in the configuration $(x, s, z)$. We put $\Phi_{xs0} = 0$ and
$\Phi_{xsx} = \frac{1}{4} x \cdot \max(5s - 1, 3s)$.

Consider one step, where the configuration is $(x, s, z)$ and two jobs $a, b$ are issued,
of span 1 and span 2, respectively. The new configuration is $(b, s', z')$, where
$s' = s q_{ab} + t q_{x'b}$, $x' = \max(a, x)$, and $z' \in \{0, b\}$. Using Lemma 1, we need to
show that for each adversary move:

$$R \cdot \Delta gain_{R2b} - \Phi_{bs'z'} + \Phi_{xsz} \ge \Delta adv \qquad (6)$$

where $\Delta gain_{R2b}$ is the expected weight of a job scheduled by R2b and $\Delta adv$ the
weight of the job scheduled by the adversary.

Case 1: Adversary schedules $b$. Then $\Phi_{xsz} \ge 0$, $\Delta adv = b$, $z' = 0$, and $\Phi_{bs'z'} = 0$.
For a fixed value of $u$ in the algorithm, the expected gain of the algorithm is
$p_{ub} u + q_{ub} b$ and (3) implies $\frac{5}{4}(p_{ub} u + q_{ub} b) \ge b$. By averaging over $u \in \{a, x'\}$ we
get $R \cdot \Delta gain_{R2b} \ge b$, which implies (6).

Case 2: Adversary does not schedule $b$. Then $z' = b$, $\Phi_{bs'z'} = \frac{1}{4} b \cdot \max(5s' - 1, 3s')$,
$\Delta gain_{R2b} = s'b + s p_{ab} a + t p_{x'b} x'$. Substituting into (6), it is enough to prove that

$$\min(b, 2s'b) + 5 s p_{ab} a + 5 t p_{x'b} x' + 4 \Phi_{xsz} \ge 4 \cdot \Delta adv. \qquad (7)$$

Case 2.1: Adversary schedules $a$. Then $\Delta adv = a \le x'$ and $\Phi_{xsz} \ge 0$. For the
first term of the minimum, we use (2) twice and get $b + 5 s p_{ab} a + 5 t p_{x'b} x' =
s(b + 5 p_{ab} a) + t(b + 5 p_{x'b} x') \ge 4sa + 4tx' \ge 4a$. For the second term of the
minimum, we use (4) twice and get $2s'b + 5 s p_{ab} a + 5 t p_{x'b} x' = s(5 p_{ab} a + 2 q_{ab} b) +
t(5 p_{x'b} x' + 2 q_{x'b} b) \ge 4sa + 4tx' \ge 4a$.

Case 2.2: Adversary schedules $z = x$. It must be the case that $x' = x \ge a$, as
otherwise the adversary would prefer to schedule $a$. We have $\Delta adv = x$.

If $x \ge b$, then $p_{xb} = 1$. We use $4\Phi_{xsz} = 4\Phi_{xsx} \ge (5s - 1)x$ and obtain
$5 t p_{xb} x + 4\Phi_{xsz} \ge 5tx + 5sx - x = 4x$, which implies (7).

It remains to consider the case $x < b$. Using (2), (5) and (4) we obtain
$b + 5 t p_{xb} x \ge b + t(4x - b) = 4tx + sb$ and $2s'b + 5 s p_{ab} a + 5 t p_{xb} x = s(2 q_{ab} b +
5 p_{ab} a) + t(2 q_{xb} b + 5 p_{xb} x) \ge sb + 4tx$. Together with $4\Phi_{xsz} = 4\Phi_{xsx} \ge 3sx$
and $x < b$ this implies $\min(b, 2s'b) + 5 s p_{ab} a + 5 t p_{xb} x + 4\Phi_{xsz} \ge 4tx + sb + 3sx \ge 4x$
and (7) follows.

5 Deterministic Algorithm for s-Bounded Instances 

The 2-bounded (deterministic) case is now well understood. A $\phi$-competitive
algorithm was given in [1], matching the lower bound from [8,7]. In this section,
we extend the upper bound of $\phi$ to 3-bounded instances. For the general case,
the best known competitive ratio for deterministic algorithms is 2 [8,7].

We define two algorithms. They both use a real-valued parameter, $\alpha$ or $\beta$, and
they are both $\phi$-competitive for 3-bounded instances for an appropriate value of
the parameter. In this section, $h$ always denotes the heaviest pending job. The
first algorithm schedules a relatively heavy job with the smallest deadline. The 
idea of the second algorithm is to balance the maximum gain in the next step 
against the discounted projected gain in the following steps, if no new jobs are 
issued. A plan (at time t) is an optimal schedule of jobs pending at time t. A 
plan can be computed by iteratively scheduling pending jobs, from heaviest to 
lightest, at their latest available slots. 

Algorithm EdF$_\alpha$: Execute the earliest-deadline job with weight $\ge \alpha w_h$.

Algorithm Bal$_\beta$: At each step, execute the job $j$ that maximizes $w_j + \beta \pi_j$, where
$\pi_j$ is the total weight of the plan in the next time step, if $j$ is executed in the
current step. (In case of a tie, the algorithm chooses the earliest-deadline job,
and if there are several, the heaviest one among those.)
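Both selection rules, together with the plan computation described above, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: pending jobs are `(deadline, weight)` pairs, a job with deadline `d` may occupy any slot up to `d - 1`, and the function names are hypothetical.

```python
import math

def plan_weight(pending, t):
    """Weight of a plan at time t: place jobs heaviest first, each in its
    latest free slot among t, t+1, ..., deadline - 1."""
    used, total = set(), 0.0
    for d, w in sorted(pending, key=lambda j: -j[1]):
        slot = d - 1
        while slot >= t and slot in used:
            slot -= 1
        if slot >= t:
            used.add(slot)
            total += w
    return total

def edf_alpha(pending, alpha):
    """EdF_alpha: earliest-deadline job among those with weight >= alpha * w_h."""
    w_h = max(w for _, w in pending)
    eligible = [j for j in pending if j[1] >= alpha * w_h]
    return min(eligible, key=lambda j: (j[0], -j[1]))

def bal_beta(pending, t, beta):
    """Bal_beta: maximize w_j + beta * (plan weight at t + 1 once j is executed)."""
    def score(j):
        rest = list(pending)
        rest.remove(j)
        return j[1] + beta * plan_weight(rest, t + 1)
    best = max(score(j) for j in pending)
    ties = [j for j in pending if math.isclose(score(j), best)]
    return min(ties, key=lambda j: (j[0], -j[1]))   # earliest deadline, then heaviest
```

For pending jobs `[(1, 1.0), (2, 1.5)]` at time 0, both rules with parameter $\phi - 1$ pick the lighter, more urgent job, because the heavier one can still be executed in the next step.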

We establish the following facts about Bal$_\beta$. The proofs are omitted. All
positive results can be shown using Lemma 1. (a) Let $\beta \in \{\phi - 1, 2 - \phi\}$, where
$\phi \approx 1.618$ is the golden ratio. Then Bal$_\beta$ is $\phi$-competitive for 2-bounded
instances. (b) Bal$_{\sqrt{2}/2}$ is $\sqrt{2}$-competitive for 2-uniform instances (this is the
best ratio for memoryless algorithms, as discussed in Section 6). (c) Bal$_{\phi-1}$ is
$\phi$-competitive for 3-bounded instances and is not $\phi$-competitive for 8-bounded
instances.

Theorem 4. EdF$_{\phi-1}$ is $\phi$-competitive for 3-bounded instances.

Proof. We fix a canonical earliest-deadline adversary schedule $A$. Let $E$ be the
schedule computed by EdF$_{\phi-1}$. We use the following charging scheme: Let $j$ be
the job scheduled by the adversary at time $t$. If $j$ is executed in $E$ before time
$t$, charge $j$ to its copy in $E$. Otherwise, charge $j$ to the job in $E$ scheduled at $t$.

Fix some time step $t$, and let $f$ be the job scheduled in $E$ at time $t$. Let also
$h$ be the heaviest pending job in $E$ at time $t$. By the definition of EdF$_{\phi-1}$, $f$ is
the earliest-deadline job with $w_f \ge (\phi - 1) w_h = w_h/\phi$. Denote also by $j$ the job
scheduled in $A$ at time $t$.

Job $f$ receives at most two charges: one from $j$ and one from itself, if $f$ is
executed in $A$ at some later time. Ideally, we would like to prove that the sum
of the charges is at most $\phi w_f$. It turns out that in some cases this is not true,
and, if so, we then show that for the job $g$ scheduled by $E$ in the next step, the
total of all charges to $f$ and $g$ is at most $\phi (w_f + w_g)$. Summing over all such
groups of one or two jobs, the $\phi$-competitiveness of EdF$_{\phi-1}$ follows.

If $f$ receives only one charge, it is at most $\phi w_f$. If this charge is from $f$, it is
trivially at most $w_f$. If the charge is from $j$ (not scheduled before $t$ in $E$), then
$j$ is pending at $t$ in $E$ and thus $w_j \le w_h \le \phi w_f$, by the definition of EdF$_{\phi-1}$.
In this case the group consists of a single job and we are done.

It remains to handle the case when $f$ receives both charges. Since in $A$ job $j$
is before $f$, we have $d_j \le d_f$ (and for $d_j = d_f$, the tie is broken in favor of $j$). But
at time $t$, EdF$_{\phi-1}$ chooses $f$, so $j$ is not eligible for execution by EdF$_{\phi-1}$, that is
$w_j < (\phi - 1) w_h$. If $w_f = w_h$, then $f$ is charged at most $w_f + w_j \le (1 + \phi - 1) w_h =
\phi w_f$, and we have a group with a single job again.

Otherwise, $w_f < w_h$ and the adversary does not schedule $f$ at time $t$, hence
$d_f \ge t + 2$. By the rule of EdF$_{\phi-1}$, $d_h > d_f$. As the span is bounded by 3,
it has to be the case that $d_h = t + 3$ and $d_f = t + 2$. Thus the adversary
schedules $f$ at time $t + 1$. The weight of the job $g$ scheduled at time $t + 1$
in $E$ is $w_g \ge (\phi - 1) w_h$, as $h \ne f$ is still pending in $E$. Furthermore, $g$ gets
only the charge from itself, as the adversary at time $t + 1$ schedules $f$ which is
charged to itself. The total weight of the jobs charged to $f$ and $g$ is thus at most
$w_j + w_f + w_g \le (\phi - 1) w_h + w_f + w_g \le \frac{3}{2}(w_f + w_g)$, since both $w_g$ and $w_f$ are
at least $(\phi - 1) w_h$. In this case we have a group of two jobs.

A more careful analysis yields an upper bound of $2 - \Theta(1/s)$ on the
competitive ratio of EdF$_\alpha$ on s-bounded instances.

Theorem 5. For each $s \ge 4$, algorithm EdF$_{1/\lambda_s}$ is $\lambda_s$-competitive for s-bounded
instances, where $\lambda_s$ is the unique non-negative solution of equation

$$(2 - \lambda_s)\bigl(\lambda_s^2 + \lfloor s/3 \rfloor \lambda_s + s - 2 - 2\lfloor s/3 \rfloor\bigr) = \lambda_s^2 - \lambda_s.$$

We get $\lambda_4 = \sqrt{3} \approx 1.732$. For larger $s$, the equation is cubic. It can be verified
that $2 - 2/s < \lambda_s < 2 - 1/s$, and in the limit for $s \to \infty$, $\lambda_s = 2 - 2/s + o(1/s)$.
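The value of $\lambda_s$ can be approximated numerically. The sketch below assumes that the equation of Theorem 5 reads $(2 - \lambda)(\lambda^2 + \lfloor s/3 \rfloor \lambda + s - 2 - 2\lfloor s/3 \rfloor) = \lambda^2 - \lambda$; the function names are hypothetical. It bisects on the interval $[1, 2]$, where the difference of the two sides changes sign (it equals $s - 1 - \lfloor s/3 \rfloor > 0$ at $\lambda = 1$ and $-2$ at $\lambda = 2$).

```python
import math

def equation_gap(lam, s):
    """Left-hand side minus right-hand side of the (assumed) equation for lambda_s."""
    q = s // 3
    return (2 - lam) * (lam ** 2 + q * lam + s - 2 - 2 * q) - (lam ** 2 - lam)

def lambda_s(s, tol=1e-12):
    """Approximate the root in [1, 2] by bisection on the sign of equation_gap."""
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if equation_gap(mid, s) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For $s = 4$ the constant term vanishes, the equation reduces to $\lambda^2 = 3$, and the bisection returns a value close to $\sqrt{3}$.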
Recall that, by Theorem 1, results for discrete metered tasks can be applied
to unit-job scheduling. Here we describe two such results. We say that a pending 
job i dominates another pending job j if di < dj and Wi > Wj. A pending job 
is dominant if no other pending job dominates it. In [4], the authors considered 
the case of the metered-task model when there are at most s dominant jobs at 
each time, and proposed an online algorithm Gap for this case. In s-bounded 
instances there can be at most s pending dominant jobs at any time. Thus, the 
results from [4] imply that: 

Theorem 6. Gap is $r_s$-competitive for s-bounded instances, where $r_s$ is the
unique positive real root of the equation $r_s = 1 + \ldots$

We can show that $r_s = 2 - \Theta(1/s)$. EdF$_\alpha$ has a smaller competitive ratio
for s-bounded instances, but Gap can also be applied to the more general set
of instances that have at most $s$ dominant jobs at any time. Gap can also be
slightly modified to give the same performance without knowing the value of $s$ in
advance. In [3], an algorithm Fit was given for the discrete metered-task model.
Its competitive ratio is better than 2 when the ratio of maximum to minimum
job weights is at most $\xi$. By Theorem 1, we have:

Theorem 7. Fit is $(2 - 1/(\lceil \lg \xi \rceil + 2))$-competitive for unit-job scheduling.

6 2-Uniform Instances 

We first prove a lower bound of $4 - 2\sqrt{2} \approx 1.172$ on the competitive ratio of
randomized algorithms. This improves a lower bound of 10/9 from [8].

Theorem 8. No randomized algorithm can be better than $(4 - 2\sqrt{2})$-competitive
for 2-uniform instances.

Proof. We use Yao’s minimax principle [9], by showing a distribution on
instances that forces each online algorithm $A$ to have ratio at least $4 - 2\sqrt{2}$.

We will generate an instance randomly. Fix a large integer $n$ and let
$a = \sqrt{2} + 1$ and $p = 1/a = \sqrt{2} - 1$. Each instance consists of stages $0, 1, \dots$, where in
stage $j$ we have three jobs: two jobs $j, j'$ of weight $a^j$ issued at time $2j$ and one
job $j''$ of weight $a^{j+1}$ issued at time $2j + 1$. After each stage $j \le n$, we continue
with probability $p$ or stop with probability $1 - p$. After stage $n$, at time $2n + 2$,
we issue two jobs of weight $a^{n+1}$, and stop.

Fix a deterministic online algorithm $A$. We compute the expected gain of $A$
and the adversary in stage $j \le n$, conditioned on stage $j$ being reached. At time
$2j$, $A$ executes a job of weight $a^j$ (it has no choice), say $j$. If it executes $j''$ at
time $2j + 1$, its gain in stage $j$ is $a^j + a^{j+1} = (1 + a) a^j = (2 + \sqrt{2}) a^j$. If it executes
$j'$, its gain is either $2a^j + a^{j+1}$ or $2a^j$, depending on whether we stop, or continue
generating more stages. Thus its expected gain is $(1 - p)(2a^j + a^{j+1}) + p \cdot 2a^j =
(2 + \sqrt{2}) a^j$, same as in the previous case. Since the probability of reaching this
stage is $p^j$, the contribution of this stage to $A$’s expected gain is $2 + \sqrt{2}$.

We now calculate the adversary gain in stage $j$. If we stop, the adversary
gains $2a^j + a^{j+1}$, otherwise he gains $a^j + a^{j+1}$, so his expected gain is
$(1 - p)(2a^j + a^{j+1}) + p (a^j + a^{j+1}) = a^j (2 - p + a) = 4a^j$. Thus the contribution of
this stage to the adversary’s gain is 4.

Summarizing, for each stage, except the last one, the contributions towards
the expected value are $2 + \sqrt{2}$ for $A$ and 4 for the adversary, independent of $j$.
The contributions of stage $n + 1$ are different, but also constant. So the overall
ratio will be, in the limit for $n \to \infty$, the same as the ratio of the contributions
of stages $0, \dots, n$, which is $4/(2 + \sqrt{2}) = 4 - 2\sqrt{2}$, as claimed.
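The per-stage arithmetic of this proof is easy to confirm numerically. The following sketch is only a check of the calculation with $a = \sqrt{2} + 1$ and $p = 1/a$; the function names are hypothetical.

```python
import math

A = math.sqrt(2) + 1     # base a of the job weights
P = 1 / A                # continuation probability p = sqrt(2) - 1

def online_stage_gain(j, runs_second_copy):
    """Expected gain of the online algorithm in stage j, given the stage is reached."""
    if not runs_second_copy:                      # executes j'' at time 2j + 1
        return A ** j + A ** (j + 1)
    # executes j' and gains j'' as well only if the instance stops
    return (1 - P) * (2 * A ** j + A ** (j + 1)) + P * (2 * A ** j)

def adversary_stage_gain(j):
    """Expected gain of the adversary in stage j, given the stage is reached."""
    return (1 - P) * (2 * A ** j + A ** (j + 1)) + P * (A ** j + A ** (j + 1))

def contribution(stage_gain, j):
    """Weight the stage gain by the probability p^j of reaching stage j."""
    return P ** j * stage_gain
```

Every stage contributes $2 + \sqrt{2}$ to the online algorithm regardless of its choice and 4 to the adversary, so the ratio of contributions is $4/(2 + \sqrt{2})$.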

Deterministic algorithms for 2-uniform instances were studied in [1], where an
upper bound of $\sqrt{2}$ was given. As we show below, it is not possible to beat ratio
$\sqrt{2}$ with any deterministic memoryless algorithm. We define an online algorithm
$A$ to be memoryless if its decision at each step depends only on the pending jobs
and is invariant under multiplying all weights of pending jobs by a constant. Due
to space constraints the proof of the following theorem is omitted.

Theorem 9. No deterministic memoryless algorithm can achieve competitive
ratio better than $\sqrt{2}$ for 2-uniform instances.

7 The Multiprocessor Case 

The greedy 2-competitive algorithm [8,7] applies to both uniprocessor and
multiprocessor cases. For $M$ processors we give an algorithm with competitive ratio
$(1 - (\frac{M}{M+1})^M)^{-1}$, showing that the competitive ratio improves with a larger
number of processors. When $M \to \infty$ this ratio tends to $e/(e - 1) \approx 1.58$, beating
the $\phi \approx 1.618$ bound for $M = 1$ [5]. The basic idea of our algorithm is
similar to algorithm Mixed [5] and our randomized algorithm RMix. We divide
the processing effort between $M$ processors, such that each processor works on
the earliest-deadline job with weight above a certain threshold. This threshold
decreases geometrically for each processor. If no job is above the threshold, we
select the heaviest remaining job, and reset the threshold to the weight of this
job. Throughout this section let $\beta = M/(M + 1)$, $R = (1 - \beta^M)^{-1}$.

Algorithm DMix-M. Let X be the set of pending jobs at a time t. The algorithm
chooses jobs $h_1, \ldots, h_M$ as shown below and schedules them for execution.

  $i \leftarrow 1$;
  repeat
    $g \leftarrow$ heaviest job in $X - \{h_1, \ldots, h_{i-1}\}$; $h_i \leftarrow g$; $j \leftarrow i$;
    repeat
      $i \leftarrow i + 1$;
      $f \leftarrow$ earliest-deadline job in $X - \{h_1, \ldots, h_{i-1}\}$ with $w_f \ge \beta^{i-j} w_g$;
      if f exists then $h_i \leftarrow f$;
    until f does not exist
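The selection rule can be sketched in code. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions: jobs are hypothetical `(weight, deadline)` pairs, ties are broken arbitrarily, the threshold for the job chosen `steps` positions after a heaviest job g is taken to be $\beta^{steps} w_g$ (matching "decreases geometrically"), and selection stops once M jobs are chosen or no pending job remains.

```python
def dmix_m(pending, M):
    """Pick up to M jobs from `pending`, a list of (weight, deadline) pairs."""
    beta = M / (M + 1)
    rest = list(pending)
    chosen = []
    while rest and len(chosen) < M:
        g = max(rest, key=lambda job: job[0])        # heaviest remaining job (a g-job)
        chosen.append(g)
        rest.remove(g)
        steps = 1                                    # plays the role of i - j
        while rest and len(chosen) < M:
            threshold = beta ** steps * g[0]         # geometrically decreasing threshold
            heavy = [job for job in rest if job[0] >= threshold]
            if not heavy:
                break                                # f does not exist: pick a new g-job
            f = min(heavy, key=lambda job: job[1])   # earliest deadline above the threshold
            chosen.append(f)
            rest.remove(f)
            steps += 1
    return chosen

# With M = 2 the threshold after the heaviest job is 2/3 of its weight.
print(dmix_m([(1.0, 9), (0.8, 3), (0.9, 7)], 2))
```

In the example the job of weight 0.8 is preferred to the heavier job of weight 0.9 because both exceed the threshold and it has the earlier deadline.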

Fix a time step t. Denote $v_i = w_{h_i}$ for all i. Normalize the weights so that
$v_1 = 1$. We call those $h_i$ selected in the outer repeat loop g-jobs. We only prove
the case of two g-jobs, and leave the complete proof to the full paper. Suppose the
g-jobs are $h_1$ and $h_k$, $1 < k \le M$. By the choices of DMix-M, we have $v_i \ge \beta^{i-1}$
for $i \in \{1, 2, \ldots, k-1\}$, $v_k < \beta^{k-1}$ and $v_i \ge v_k \beta^{i-k}$ for $i \in \{k+1, \ldots, M\}$.

Lemma 2. (i) $(k-1) + (M-k+1)v_k \le R \cdot \left(\sum_{i=1}^{k-1} \beta^{i-1} + v_k \sum_{i=k}^{M} \beta^{i-k}\right)$.

(ii) $M\beta^{p-k}v_k + \sum_{i=1}^{p} v_i \le R \cdot \sum_{i=1}^{M} v_i$ for any positive integer $p \in \{k, \ldots, M\}$.

Theorem 10. DMix-M is $(1 - (\frac{M}{M+1})^M)^{-1}$-competitive for M processors.

Proof. (Sketch.) For a given input instance, let D be a schedule of DMix-M
and A the adversary schedule. As usual, we assume that A is canonical
earliest-deadline. Fix a time step t. Let

H = $\{h_1, \ldots, h_M\}$, the jobs executed by DMix-M at time t,

J = the set of M jobs executed by the adversary at time t,

X = the pending jobs of DMix-M at time t,

Y = the pending jobs of the adversary at time t that will be executed at time t
or later.

For a set of jobs I, let $w(I) = \sum_{i \in I} w_i$ denote the total weight of I. Define the
potential function $\Phi = w(Y - X)$. By Lemma 1, it is thus sufficient to show that
$w(J) + \Delta\Phi \le R \cdot w(H)$.

Job arrivals and expirations cannot increase the potential. So we only need
to analyze how the potential changes after job executions. The change in the
potential due to the adversary executing the jobs in J is $-w(J - X)$, as the
jobs in $J - X$ contribute to the current potential but will not contribute in the
next step. A job $h_i$ executed by DMix-M does not contribute to the potential
in the current step, but if $h_i \in Y - J$, then, in the next step, $h_i$ will be pending
in A but not in D, so it will contribute to the new potential. Thus the change
due to DMix-M executing the jobs in H is $w(H \cap Y - J)$. We conclude that
$\Delta\Phi = -w(J - X) + w(H \cap Y - J)$. Therefore, in order to prove the theorem it
is sufficient to show that $w(J \cap X) + w(H \cap Y - J) \le R \cdot w(H)$.

Case 1: $H \cap Y - J = \emptyset$. Jobs $j \in J \cap X$ must have weight at most 1, and at most
k - 1 of them can have weights larger than $v_k$, since otherwise DMix-M would
choose the g-jobs differently. Thus, using Lemma 2, we get:
$w(J \cap X) + w(H \cap Y - J) \le (k-1) + (M-k+1)v_k + 0 \le R \cdot \left(\sum_{i=1}^{k-1} \beta^{i-1} + v_k \sum_{i=k}^{M} \beta^{i-k}\right) \le R \cdot \left(\sum_{i=1}^{k-1} v_i + \sum_{i=k}^{M} v_i\right) = R \cdot w(H)$.

Case 2: $H \cap Y - J \ne \emptyset$. Let p be the largest index for which $h_p \in Y - J$. In other
words, $h_p$ is the highest-indexed job in H that will be executed in A at a later
time. Since A is earliest-deadline, we have $d_j \le d_{h_p}$ for all $j \in J$. We distinguish
two subcases.

Case 2.1: $p \ge k$. In this case, $w_j < \beta^{p-k} v_k$ for any $j \in J \cap X - H$, since otherwise
they would be scheduled instead of $h_k$. Thus by Lemma 2,
$w(J \cap X) + w(H \cap Y - J) = w(J \cap X - H) + w(H \cap Y) \le M\beta^{p-k} v_k + \sum_{i=1}^{p} v_i \le R \cdot w(H)$.

Case 2.2: $p < k$. We have $w_j \le v_k$ for any $j \in J \cap X - \{h_1, \ldots, h_k\}$, since
otherwise they would be scheduled instead of $h_k$. Thus by Lemma 2 with p = k,
$w(J \cap X) + w(H \cap Y - J) = w(J \cap X - H) + w(H \cap Y) \le \sum_{j \in J \cap X - H} w_j + \sum_{i=1}^{p} v_i \le M v_k + \sum_{i=1}^{k} v_i \le R \cdot w(H)$.
The lower bound proofs in [5] and Theorem 8 can easily be generalized to
the multiprocessor case, improving the bounds in [8] ($4 - 2\sqrt{2}$ for the general
case and 10/9 for the 2-uniform case):

Theorem 11. No deterministic or randomized algorithm can be better than
5/4-competitive, for any number of processors M. No deterministic or randomized
algorithm can be better than $(4 - 2\sqrt{2})$-competitive, for 2-uniform instances on any
number of processors M.
Abstract. We consider the problem of preemptive scheduling on
uniformly related machines. We present a semi-online algorithm which, if
the optimal makespan is given in advance, produces an optimal
schedule. Using the standard doubling technique, this yields a 4-competitive
deterministic and an $e \approx 2.71$-competitive randomized online algorithm. In
addition, it matches the performance of the previously known algorithms
for the offline case, with a considerably simpler proof. Finally, we study
the performance of greedy heuristics for the same problem.

1 Introduction 

We consider the scheduling problem denoted $Q|\text{pmtn}|C_{\max}$ in the three-field
notation. We are given m uniformly related machines, each characterized by its 
speed, and a sequence of jobs, each characterized by its processing time. If a 
job with processing time p is assigned to a machine of speed s it requires time 
p/s. Allowing preemption means that any job may be divided into several pieces 
that may be processed on several machines; however, the time slots assigned to 
different pieces need to be disjoint. The goal is to minimize the length of the 
schedule (makespan), i.e., the time when all jobs are finished. 

In the online problem $Q|\text{online-list}, \text{pmtn}|C_{\max}$ the jobs arrive in a sequence
and we have to assign each job without any knowledge of the future requests; the 
algorithm has to determine immediately all the machines and all the time slots 
in which the current job is scheduled. We also consider the semi-online variant 
in which an algorithm is given in advance the value of the optimal makespan. 
(Semi-)online algorithms are evaluated by the competitive ratio, which is the 
worst case ratio of the length of the produced schedule to the minimal length. 

Finally, we also study the performance of two well-known greedy heuristics, 
LIST and LPT. LIST (LIST scheduling) is an online algorithm which schedules 
each coming job so that it finishes as soon as possible. For preemptive scheduling, 
it means that at each time the job is scheduled on the fastest available machine. 
LPT (Largest Processing Time first) uses the same strategy, but the jobs are 
sorted and processed from the largest one; i.e., it is no longer an online algorithm. 

Preemptive scheduling on uniformly related machines is a classical scheduling
problem, yet it has not received much attention in the online version. One
motivation for its study is the expectation that, as for identical machines, the
problem should be tractable, as the structure of the optimum is well understood,
and at the same time it could provide useful insight for constructing efficient
randomized algorithms for the non-preemptive version.

We describe known results and our contribution in each area separately. 

Optimal offline and semi-online algorithms 

For offline preemptive scheduling the optimal solution was given already by
Horvath et al. [14] and Gonzales and Sahni [13]. The algorithm of Gonzales and
Sahni is more efficient: First, the total number of preemptions in the schedule
is $2(m-1)$, which is the best possible bound for schedules with the optimal
makespan. Second, its running time is $O(n + m \log m)$ and this is also best
possible: the term $m \log m$ is caused by sorting the largest m jobs, which is
necessary to obtain an optimal schedule. Another algorithm using $2(m-1)$
preemptions was given in [15]; it simplifies the algorithm of Gonzales and Sahni,
but it also sorts the jobs, so it is not semi-online and needs time $O(n \log n)$.

An optimal (1-competitive) semi-online algorithm with the optimal makespan
known in advance was previously known only for two machines, see Epstein [8].

Our results. We give an optimal (1-competitive) semi-online algorithm for the
studied problem. It generates at most $2(m-1)$ preemptions and runs in time
$O(n + m \log m)$, thus it is as efficient as the offline algorithm of Gonzales and
Sahni. In addition it has the advantage of being semi-online, i.e., the jobs can
be scheduled in an arbitrary order after computing the optimal makespan.

Since the value of the optimal makespan can be easily computed, our algo- 
rithm can be also used as an efficient offline algorithm instead of the algorithm of 
Gonzales and Sahni. The efficiency is the same and we believe that our algorithm 
is significantly simpler and easier to understand. 

Online algorithms 

For $Q|\text{online-list}|C_{\max}$, non-preemptive scheduling on uniformly related
machines, the first constant competitive algorithm was given by Aspnes et al. [1]; it
is deterministic and its competitive ratio is 8. This was improved by Berman et
al. [3]; they present a 5.828-competitive deterministic and a 4.311-competitive
randomized algorithm. For an alternative presentation see [2].

These algorithms can also be used for preemptive scheduling. Woeginger [18]
observed that the optimal non-preemptive makespan is at most twice the optimal
preemptive makespan for uniformly related machines. Consequently, the previous
algorithms that do not use preemption are also 11.657-competitive deterministic
and 8.622-competitive randomized algorithms for $Q|\text{online-list}, \text{pmtn}|C_{\max}$. No
better preemptive online algorithms were known before for the general case.

All these algorithms are based on a semi-online algorithm and a doubling 
strategy for guessing the optimal value. This common tool was first used for 
online scheduling in [16,1]. 

The lower bounds for $Q|\text{online-list}|C_{\max}$ are 2.438 for deterministic
algorithms [3] and 2 for randomized algorithms; the same lower bound of 2 works for
$Q|\text{online-list}, \text{pmtn}|C_{\max}$ both for deterministic and randomized algorithms [11].

Our results. Using our 1-competitive semi-online algorithm and the same
doubling strategy as in the previous results, we obtain a 4-competitive deterministic
and an $e \approx 2.7183$-competitive randomized algorithm for $Q|\text{online-list}, \text{pmtn}|C_{\max}$.

Greedy algorithms

Both LIST and LPT were previously studied for the non-preemptive case,
$Q|\text{online-list}|C_{\max}$. The competitive ratio of LIST is not constant; it is
asymptotically $\Theta(\log m)$, see [5,1]. However, sorting the jobs improves the performance
dramatically. The competitive ratio of LPT is between 1.52 and 1.66 [12]; a better
upper bound of 1.58 is claimed in [6], but the proof appears to be incomplete.

Our results. We show that with preemption, $Q|\text{online-list}, \text{pmtn}|C_{\max}$, the
situation is similar. The competitive ratio of LIST is $\Theta(\log m)$ and the competitive
ratio of LPT is 2. More precisely, it is between $2 - 2/(m+1)$ and $2 - 1/m$.

Special cases 

We conclude by a few cases in which we know the exact competitive ratio for 
preemptive scheduling from previous results. 

The first case is that of identical machines (i.e., all the speeds are equal to 1),
denoted by $P|\text{online-list}, \text{pmtn}|C_{\max}$. Chen et al. [4] give an optimal
deterministic algorithm and a matching lower bound which works even for randomized
algorithms. The optimal competitive ratio is 4/3 for m = 2 and increases to
$e/(e-1) \approx 1.582$ as $m \to \infty$.

For the special case of two related machines the optimal competitive ratio for
preemptive scheduling, $Q2|\text{online-list}, \text{pmtn}|C_{\max}$, was given independently by
Wen and Du [17] and Epstein et al. [10] for any combination of speeds. If the ratio
of the two speeds is $s \ge 1$, the optimal competitive ratio is $1 + s/(s^2 + s + 1)$ (this
is equal to 4/3 for s = 1 and decreases to 1 as $s \to \infty$); randomization does not
help here either. The semi-online deterministic case of $Q2|\text{online-list}, \text{pmtn}|C_{\max}$
with jobs arriving sorted was completely analyzed in [9].

The special case of non-decreasing speed ratios was solved in [7]. Extending
the technique for identical machines, the exact competitive ratio is given for 
all combinations of speeds satisfying the given restriction; all these values are 
smaller than or equal to 2. 

2 Preliminaries 

Let $M_i$, $i = 1, \ldots, m$, denote the m machines and let $s_i > 0$ be the speed of
machine $M_i$. We assume, w.l.o.g., that the machines are sorted so that
$s_1 \ge s_2 \ge \cdots \ge s_m$. The input sequence of jobs is denoted $J = (p_j)_{j=1}^{n}$, where n is the
number of jobs and $p_j \ge 0$ is the processing time of the jth job.

Let OPT be the makespan of the optimal schedule. There are two easy lower
bounds on OPT. First, OPT is bounded by the total work that can be done on
all machines. Thus

$OPT \ge \frac{\sum_{j=1}^{n} p_j}{\sum_{i=1}^{m} s_i}$.    (1)

Second, OPT is bounded by the optimal makespan of any k jobs. An optimal
schedule of k jobs uses only the k fastest machines: if it used a slower machine, some
faster one would be idle at the same time. Thus, for all $k = 1, \ldots, m$,

$OPT \ge \frac{\sum_{j=1}^{k} p_{(j)}}{\sum_{i=1}^{k} s_i}$,    (2)

where $p_{(j)}$ is the jth largest processing time. It is known that the actual value
of OPT is the minimal value satisfying the conditions (1) and (2) [14,13]. In
particular, we can compute the value OPT in time $O(n + m \log m)$.
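Since OPT is the smallest value satisfying (1) and (2), it equals the maximum of the right-hand sides. The following sketch computes it that way; it is an illustration only, the function name `optimal_makespan` is hypothetical, and for simplicity it sorts all jobs (time $O(n \log n)$) instead of selecting the m largest ones as the stated bound would require.

```python
from fractions import Fraction

def optimal_makespan(speeds, jobs):
    """Largest right-hand side of bounds (1) and (2), computed exactly."""
    s = sorted(speeds, reverse=True)
    p = sorted(jobs, reverse=True)
    best = Fraction(sum(p)) / Fraction(sum(s))      # bound (1): all work on all machines
    work = cap = Fraction(0)
    for k in range(min(len(s), len(p))):            # bound (2) for the k + 1 largest jobs
        work += Fraction(p[k])
        cap += Fraction(s[k])
        best = max(best, work / cap)
    return best

# Two machines of speeds 2 and 1 and two jobs of size 3: bound (1) is tight.
print(optimal_makespan([2, 1], [3, 3]))
```

Using `Fraction` keeps the value exact, which is convenient when it is later passed to a procedure that must decide whether a job fits precisely.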

3 An Optimal Semi-online Algorithm 

We present the semi-online algorithm, which, given T, generates a schedule with 
makespan at most T, if some such schedule exists. 

The idea of the algorithm is to schedule each job on two adjacent machines 
using one preemption so that it spans over the whole interval [0,T), i.e., it is 
always running on one of the machines. Thus at each time exactly one of these 
machines remains idle. Such a pair of machines can be thought of as one virtual
machine with possibly changing speed. For the subsequent jobs, such virtual 
machines are used in place of real ones. See Fig. 1 for an example. If a job is too 
small, we create a machine with zero speed, to fit this scheme. To prove that this 
outline works, it remains to check that if a job is too long to fit on a machine, 
T is smaller than OPT, as one of the conditions (1) and (2) is violated. 



3.1 Preliminaries 

We assume in this section, w.l.o.g., that $p_j > 0$ for all j; the algorithm can skip
the jobs with zero processing times. We define machines $M_{m+1}, M_{m+2}, \ldots$ as
machines with zero speed. These machines only serve to simplify the description
of the algorithm as otherwise we would need to analyze separately a case when a
job is too small to occupy the slowest machine for the whole time interval [0, T).

We define a virtual machine as a set of adjacent machines, such that exactly
one of them is idle at any time in [0, T). Let $V_i$ denote the ith virtual machine.
Scheduling a job on $V_i$ at time t means that we schedule it on the machine in
$V_i$ that is idle at time t. The speed of virtual machine $V_i$ is denoted $v_i(t)$; it is
defined to be the speed of the unique machine in $V_i$ which is idle at time t. Let
$W_i = \int_0^T v_i(t)\,dt$ be the total work which can be done on $V_i$. Note that a virtual
machine is defined so that all this work can be used by a single job.

3.2 Algorithm InTime 

The algorithm is defined to schedule in the interval [offset, offset + time) instead 
of [0,time). This is used later in the online variants of the algorithm. 

Invariants: The algorithm works with sets $V_i$ and numbers $W_i$. The following
properties of the virtual machines are invariants of the algorithm:

1. Sets $V_i$ are virtual machines. Every real machine belongs to exactly one
virtual machine. $W_i = \int_0^T v_i(t)\,dt$.

2. For all i and t, $v_i(t) \ge v_{i+1}(t)$. This also implies $W_i \ge W_{i+1}$.

3. Each job that is already processed is scheduled on machines that belong to
a single virtual machine. For every i, there are exactly $|V_i| - 1$ jobs that are
scheduled on the machines that belong to $V_i$.

Fig. 1. One step of the algorithm InTime. The vertical axis is the time. The width of the
columns is proportional to the speed of the machines and thus the area is proportional
to the work done on a given job. (a) Two machines with one job that spans over [0, T).
(b) The new virtual machine obtained by merging the two idle parts. There is a new
postponed preemption at time t.



Algorithm InTime(T)

- Initialization:

  procedure InitInTime(offset, time)

    T := time; o := offset

    For all i do $V_i := \{M_i\}$; $W_i := s_i \cdot T$

- Step: (schedule a job with processing time p)

  function DoInTime(p)

    1. Find i such that $W_i \ge p > W_{i+1}$ or return FALSE

    2. Find t such that $\int_0^t v_i(\tau)\,d\tau + \int_t^T v_{i+1}(\tau)\,d\tau = p$

    3. Schedule job p on $V_i$ in time interval [o, o + t) and on $V_{i+1}$
       in [o + t, o + T).

    4. $V_i := V_i \cup V_{i+1}$; $W_i := W_i + W_{i+1} - p$;
       For all j > i do $V_j := V_{j+1}$; $W_j := W_{j+1}$

    5. return TRUE

- Main body:

  InitInTime(0, T); for j := 1 to n do if not DoInTime($p_j$) fail
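The following sketch mirrors the steps of InTime for offset 0. It is an illustration under stated assumptions, not the authors' implementation: each virtual machine is stored as a hypothetical list of pieces `(start, end, machine, speed)` describing which real machine is idle when, zero-speed machines are represented by pieces with machine `None`, the helper names are invented, and the simple list scans do not achieve the running time claimed in the paper. Exact arithmetic (`Fraction`) is assumed for the inputs.

```python
from fractions import Fraction

def work(segs):
    """Total work available on a profile of (start, end, machine, speed) pieces."""
    return sum((b - a) * s for a, b, _, s in segs)

def speed_at(segs, t):
    """Speed of the piece of the profile that contains time t."""
    return next(s for a, b, _, s in segs if a <= t < b)

def cut(segs, t):
    """Split a profile at time t into the pieces before t and the pieces after t."""
    before = [(a, min(b, t), mach, s) for a, b, mach, s in segs if a < t]
    after = [(max(a, t), b, mach, s) for a, b, mach, s in segs if b > t]
    return before, after

def find_t(hi, lo, p):
    """Line 2: solve int_0^t v_hi + int_t^T v_lo = p by walking over the breakpoints."""
    points = sorted({x for a, b, _, _ in hi + lo for x in (a, b)})
    f = work(lo)                                   # left-hand side at t = 0
    for a, b in zip(points, points[1:]):
        rate = speed_at(hi, a) - speed_at(lo, a)   # slope of the left-hand side on [a, b)
        if rate > 0 and f + (b - a) * rate >= p:
            return a + (p - f) / rate
        f += (b - a) * rate
    return points[-1]

def in_time(speeds, jobs, T):
    """Return one list of (start, end, machine) pieces per job, or None if T is too small."""
    by_speed = sorted(enumerate(speeds), key=lambda e: -e[1])
    virt = [[(0, T, i, s)] for i, s in by_speed]   # one virtual machine per real machine
    schedule = []
    for p in jobs:
        if p > work(virt[0]):
            return None                            # line 1 fails
        i = max(k for k in range(len(virt)) if work(virt[k]) >= p)
        hi = virt[i]
        lo = virt[i + 1] if i + 1 < len(virt) else [(0, T, None, 0)]  # zero-speed machine
        t = find_t(hi, lo, p)
        hi_before, hi_after = cut(hi, t)
        lo_before, lo_after = cut(lo, t)
        pieces = [seg for seg in hi_before + lo_after if seg[3] > 0]  # line 3
        schedule.append([(a, b, mach) for a, b, mach, _ in pieces])
        virt[i:i + 2] = [lo_before + hi_after]     # line 4: merge the two idle parts
    return schedule

speeds = [Fraction(3), Fraction(2), Fraction(1)]
for pieces in in_time(speeds, [Fraction(4), Fraction(4), Fraction(2), Fraction(2)], Fraction(2)):
    print(pieces)
```

The example uses T = 2, which is the value given by bounds (1) and (2) for this instance, so every job is placed; a smaller T makes `in_time` return `None`.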



Theorem 3.1. Algorithm InTime maintains all the invariants and generates a 
schedule with makespan at most T whenever such a schedule exists. 

Proof. All invariants are satisfied after the initialization. Now we show that the
function DoInTime() maintains the invariants and that it can fail only in line 1.

In line 2, t exists because the left-hand side of the condition is continuous in t,
and p lies between the values of the left-hand side for 0 and T (i.e., $W_{i+1}$ and $W_i$).
The value t can be computed since the function $v_i(t)$ changes its value at most
m times. Line 3 is correct, because virtual machines are idle by definition. The
real schedule can be generated because the algorithm knows the mappings to
determine $v_i(t)$. Line 4 merges the two half-used virtual machines into one virtual
machine that satisfies invariant 1. Invariant 2 is not broken because the two virtual
machines are adjacent. If there are k real machines and k - 1 jobs in $V_i$ and l
machines and l - 1 jobs in $V_{i+1}$, we create one virtual machine with k + l real
machines and k + l - 1 jobs by scheduling the actual job and merging the two
virtual machines. Thus invariant 3 is valid as well.

If DoInTime() returns FALSE in line 1, then $p_j > W_1$ when processing
some job j (we always have a machine of speed zero). We know that
$V_1 = \{M_1, \ldots, M_k\}$, for some k. In case $k \ge m$ we know that
$\sum_{j'=1}^{j} p_{j'} > T \cdot \sum_{i=1}^{m} s_i$.
By (1), T < OPT and thus no schedule exists. In case k < m we know that
there are k - 1 jobs scheduled on the machines of $V_1$. So, including j, we have
k jobs that are together larger than the total work that can be done on the k
fastest machines before T. By (2), T < OPT and no schedule exists again. □

Our algorithm can also be used as an optimal offline algorithm. As noted 
above, the exact preemptive optimum can be computed using conditions (1) 
and (2). Using the computed value as T in InTime(T), the previous theorem 
guarantees that the produced schedule is optimal. 

3.3 Efficiency of the Algorithm 

The number of preemptions. There are two types of preemptions in the 
algorithm. Immediate preemptions are created by dividing a job between two 
virtual machines. Postponed preemptions are generated by scheduling on the 
virtual machine as its speed changes. It is clear that every immediate preemption 
generates at most one postponed preemption. 

Define zero virtual machine as a set of real machines with zero speed. When 
scheduling on a non-zero virtual machine and a zero virtual machine, no imme- 
diate preemption occurs because the job is completed on the non-zero one. On 
the other hand, after scheduling on two non-zero machines, the number of non- 
zero machines decreases. Because we have m non-zero virtual machines after the 
initialization, the algorithm creates at most m — 1 immediate preemptions and 
thus no more than 2m — 2 preemptions overall. 

The time complexity and implementation. Even with a simple
implementation using linked lists to store both the list of machines and the lists of
preemptions on the machines, the algorithm is quite efficient. If a job is scheduled
partially on a zero virtual machine, it is processed in time $O(1)$. The analysis of
the number of preemptions implies that only m - 1 jobs are scheduled on two
non-zero virtual machines; each such step takes $O(m)$ time, including searching
for the time of the new preemption, actual scheduling and merging the lists of
remaining preemptions. Thus the total time is $O(n + m^2)$. This bound can be
further improved to $O(n + m \log m)$ by using search trees instead of linked lists;
the details are omitted.



3.4 Generalizations 

It is easy to generalize our algorithm so that the real machines change their 
speeds over time arbitrarily. It is necessary to preprocess the speed profiles so
that at each time the machines are sorted according to their speeds; this is easy
to do using additional preemptions to “glue” the initial virtual machines from 
pieces of different real machines. The same lower bounds (1) and (2) then hold for 
the optimum and the algorithm gives a matching schedule. Naturally, the time 
and the number of preemptions depend on the speed profiles of the machines. 



4 Doubling Online Algorithms 

When the optimal makespan is not known in advance, we guess it and if the guess 
turns out to be too small, we double it. It is well known that this technique can 
be improved by initial random guess with an exponential distribution; it is also 
not hard to optimize the multiplicative constants. The standard proof is omitted. 

Algorithm Double()

Initialization: $G := p_1/s_1$; $B := 0$; InitInTime(B, G)

Step j: while not DoInTime($p_j$) do $B := B + G$; $G := 2 \cdot G$; InitInTime(B, G)

Theorem 4.1. The algorithm Double is a 4-competitive deterministic online
algorithm for preemptive scheduling on uniformly related machines.

Algorithm DoubleRand()

Initialization:

r := rand([0, 1]); (r is uniformly distributed in [0, 1])

$G := e^r \cdot p_1/s_1$; $B := 0$; InitInTime(B, G)

Step j: while not DoInTime($p_j$) do $B := B + G$; $G := e \cdot G$; InitInTime(B, G)

Theorem 4.2. The algorithm DoubleRand is an e-competitive randomized
algorithm for preemptive scheduling on uniformly related machines.
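The constants 4 and e can be illustrated numerically. The sketch below does not run the scheduling algorithm; it models only the guesses, under the pessimistic assumption that every guess below OPT fails and the first guess of at least OPT succeeds (which a 1-competitive semi-online algorithm guarantees). The function name `worst_case_horizon` and the normalization of the first guess are hypothetical.

```python
import math

def worst_case_horizon(first_guess, opt, factor):
    """Return B + G after the first successful guess, if every guess below opt fails."""
    base, guess = 0.0, first_guess
    while guess < opt:
        base += guess          # the failed interval [B, B + G) is abandoned
        guess *= factor
    return base + guess

# Deterministic doubling: the final horizon stays below 4 * OPT.
for opt in (1.0, 1.5, 2.0, 3.7, 100.0):
    assert worst_case_horizon(1.0, opt, 2.0) < 4 * opt

# Random initial factor e**r with multiplier e: the average ratio stays below e.
N = 10000
opt = 37.5
average = sum(worst_case_horizon(math.exp((i + 0.5) / N), opt, math.e) / opt
              for i in range(N)) / N
print(average, math.e)
```

The first guess $p_1/s_1$ is normalized to 1 here, and OPT is never smaller than this value because the first job alone needs that much time on the fastest machine.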

5 Greedy Algorithms 

The greedy rule for scheduling on related machines instructs us to schedule each 
coming job so that it ends as soon as possible. With preemptions, this is achieved 
by scheduling the job from time 0 on the fastest idle machine (if there is any), for 
every time t, until it is completed. Thus the first job is scheduled on the fastest 
machine. The second job on the second fastest, with a preemption at the end of 
first job (if it is not completed earlier), and then on the fastest machine, etc. See 
Fig. 2 for an example. This algorithm is called LIST scheduling. If, in addition, 
the jobs arrive ordered so that their sizes are non-increasing, the algorithm is 
called LPT (Largest Processing Time first). 
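The greedy rule can be simulated by keeping only the sorted completion times of the jobs scheduled so far: while r earlier jobs are still running they occupy the r fastest machines, so the new job runs on the next one (or waits if none is left). The sketch below is an illustration under that reading; the function names are hypothetical and only the makespan is computed, not the full schedule.

```python
import bisect
from fractions import Fraction

def list_makespan(speeds, jobs):
    """Makespan of preemptive LIST: each job always uses the fastest idle machine."""
    s = sorted(speeds, reverse=True)
    done = []                                   # sorted completion times so far
    for p in jobs:
        remaining, start = Fraction(p), Fraction(0)
        j = len(done)
        for i in range(j + 1):
            running = j - i                     # earlier jobs still running in this interval
            speed = s[running] if running < len(s) else 0
            end = done[i] if i < j else None    # None marks the unbounded last interval
            if speed > 0 and (end is None or remaining <= speed * (end - start)):
                bisect.insort(done, start + remaining / speed)
                break
            remaining -= speed * (end - start)
            start = end
    return done[-1] if done else Fraction(0)

def lpt_makespan(speeds, jobs):
    """LPT is LIST applied to the jobs sorted from the largest one."""
    return list_makespan(speeds, sorted(jobs, reverse=True))

print(list_makespan([2, 1], [3, 3]))
print(lpt_makespan([1, 1, 1], [1, 1, 1, 1]))
```

In the second call, three unit machines receive four unit jobs; the last job has to wait until time 1, which is the kind of instance behind the lower bound for LPT.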

We prove that LIST and LPT have asymptotically the same competitive ratio 
as in the non-preemptive case. However, note that this is not a straightforward 
consequence of the non-preemptive case, as the preemptive and non-preemptive 
versions of LIST can generate different schedules on the same instance. 

Notation. For a given instance, NOPT denotes a non-preemptive optimal
schedule and its makespan. It is known that $NOPT \le 2 \cdot OPT$; this is proved as
an upper bound on the non-preemptive LPT algorithm in [18]; we use the same
proof in Theorem 5.7.

Fig. 2. An example of a schedule generated by the LIST algorithm on machines
$M_1, M_2, M_3, M_4$. Similarly shaded regions correspond to the same job.

For input sequences J and J', $J' \subseteq J$ denotes that J' is a subsequence of J
(not necessarily a contiguous segment). We say that J dominates J' if $p_j \ge p'_j$
for all j. In both cases trivially $OPT(J') \le OPT(J)$.



5.1 Analysis of LIST 

LIST is a simple online algorithm. However, we show that its competitive ratio
is $\Theta(\log m)$ and thus it is not very good. Let LIST(J) denote both the schedule
generated by LIST on the input sequence of jobs J and its makespan.

We start by the upper bound. First we show that decreasing the size of jobs 
can only improve the schedule. This implies that removing short jobs decreases 
the makespan by a constant multiple of OPT ; doing this in logarithmic number 
of phases we obtain the desired bound. 

Lemma 5.1. Suppose that J dominates J'. Then $LIST(J) \ge LIST(J')$.

Proof. Consider the schedule after scheduling job $p_j$ and let $t_{i,j}$, $i \le j$, denote
the ith smallest completion time of a job. Note that the sequence $T_j = (t_{i,j})_{i=1}^{j}$
is non-decreasing. Define $t_{0,j} = 0$. The job $p_{j+1}$ is scheduled on machine $M_{j-i+1}$
in the interval $[t_{i,j}, t_{i+1,j})$ and on $M_1$ after $t_{j,j}$; of course this holds only until
it is completed (it may be too short to reach $M_1$ or even slower machines). The
corresponding times for J' are denoted by $t'_{i,j}$ and the sequences $T'_j$.

We prove by induction on j that for all i, $t'_{i,j} \le t_{i,j}$, i.e., $T'_j$ is pointwise
smaller than or equal to $T_j$. The induction assumption for zero jobs is trivial.
The induction step says that $t'_{i,j} \le t_{i,j}$ implies $t'_{i,j+1} \le t_{i,j+1}$ assuming that
$p'_{j+1} \le p_{j+1}$.

By the induction assumption, the job $p'_{j+1}$ in LIST(J') is, at every time t,
processed by a machine that is at least as fast as the machine that processes the
job $p_{j+1}$ in LIST(J) at the same time. Moreover, $p'_{j+1} \le p_{j+1}$. So the job $p'_{j+1}$
must be completed earlier than or at the same time when $p_{j+1}$ is. The sequence
$T'_{j+1}$ is obtained from $T'_j$ by inserting the completion time of $p'_{j+1}$, while $T_{j+1}$
is obtained from $T_j$ by inserting the completion time of $p_{j+1}$. Since a smaller or
equal number is inserted into a pointwise smaller or equal sorted sequence $T'_j$,
it necessarily remains pointwise smaller than or equal to $T_{j+1}$ and the inductive
claim follows. □



Lemma 5.2. Let $J' \subseteq J$. Let t be a time such that all jobs $j \in J \setminus J'$ are
completed at time t in LIST(J). Then $LIST(J) \le t + LIST(J')$.

Proof. Let J'' be obtained from J' by replacing each job $p_j$ by $p''_j$ which is equal to
the part of $p_j$ not processed in LIST(J) by time t. Then the schedule LIST(J'')
is exactly the same as the interval [t, LIST(J)) of the schedule LIST(J), except
for the shift by t. By its definition, J'' is dominated by J'. Using Lemma 5.1 we
get $LIST(J) = t + LIST(J'') \le t + LIST(J')$. □



Lemma 5.3. All jobs j with $p_j \le p_{\max}/m$ are completed by time $3 \cdot OPT$.

Proof. First we claim that all jobs are started in LIST by the time OPT.
Otherwise all machines are busy until some time t > OPT, the total work processed
is $\sum_{j=1}^{n} p_j > OPT \cdot \sum_{i=1}^{m} s_i$, a contradiction with (1).

Second, we claim that all machines with speed $s \le s_1/m$ are idle at and after
time $2 \cdot OPT$. Let I be the indices of the slow machines, $I = \{i : s_i \le s_1/m\}$.
The total capacity of them is small, namely
$\sum_{i \in I} s_i \le (m-1)s_1/m < s_1 \le \sum_{i \notin I} s_i$,
thus $\sum_{i \notin I} s_i > \frac{1}{2} \sum_{i=1}^{m} s_i$. Suppose some machine from I is not
idle at time $2 \cdot OPT$. Then all machines not in I are busy for all the time
from 0 till $2 \cdot OPT$ and the total size of jobs processed on them is at least
$2 \cdot OPT \cdot \sum_{i \notin I} s_i > OPT \cdot \sum_{i=1}^{m} s_i$, which is a contradiction with (1) as well.

It follows that after time $2 \cdot OPT$, each job is scheduled and moreover it is
scheduled only on machines faster than $s_1/m$. If $p_j \le p_{\max}/m$, then the job $p_j$
is completed by time $2 \cdot OPT + p_j/(s_1/m) \le 2 \cdot OPT + p_{\max}/s_1 \le 3 \cdot OPT$. □



Lemma 5.4. All the jobs with $p_j \le 2p_{\min}$ are completed by time $3 \cdot NOPT$.

Proof. Let $M_k$ denote the slowest machine used in a non-preemptive optimal
schedule NOPT. Then $p_{\min}/s_k \le NOPT$. We also know that $M_k$ is idle in
LIST at time NOPT, as otherwise LIST would schedule more total work of
jobs than NOPT. Then any job with $p_j \le 2p_{\min}$ is completed by time
$NOPT + p_j/s_k \le NOPT + 2p_{\min}/s_k \le 3 \cdot NOPT$. □



Theorem 5.5. LIST is a $(9 + 6\log_2 m)$-competitive algorithm for preemptive
scheduling on related machines.

Proof. Let J be the input sequence, let $k = \lceil \log_2 m \rceil$. We define job
sequences $J_k \subseteq \cdots \subseteq J_1 \subseteq J_0 \subseteq J$; $p^{(i)}_{\max}$ and $p^{(i)}_{\min}$ then refer to the
processing times in the sequence $J_i$. Define $J_0$ as the sequence J without jobs with
$p_j \le p_{\max}/m$. Define $J_{i+1}$ as the sequence $J_i$ without jobs with $p_j \le 2p^{(i)}_{\min}$.
It follows that $J_k$ is an empty sequence and $LIST(J_k) = 0$. By Lemmas 5.3
and 5.2, $LIST(J) \le 3 \cdot OPT + LIST(J_0)$. By Lemmas 5.4 and 5.2,
$LIST(J_i) \le 3 \cdot NOPT + LIST(J_{i+1}) \le 6 \cdot OPT + LIST(J_{i+1})$. Putting this together, we
have $LIST(J) \le 3 \cdot OPT + LIST(J_0) \le 3 \cdot OPT + 6k \cdot OPT + LIST(J_k) \le
(9 + 6\log_2 m) \cdot OPT$. □
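The phase decomposition used in the proof is easy to reproduce. The sketch below builds the sequences $J_0, \ldots, J_k$ for a list of positive processing times and checks that the last one is empty; the function name `phase_sequences` is hypothetical, and the code only illustrates the counting argument, not the schedules themselves.

```python
def phase_sequences(jobs, m):
    """Return [J_0, ..., J_k] for k = ceil(log2(m)), following the proof of Theorem 5.5."""
    k = (m - 1).bit_length()                              # ceil(log2(m)) for m >= 1
    p_max = max(jobs)
    current = [p for p in jobs if p * m > p_max]          # J_0: drop jobs with p <= p_max / m
    sequences = [current]
    for _ in range(k):
        if current:
            p_min = min(current)
            current = [p for p in current if p > 2 * p_min]   # drop jobs with p <= 2 * p_min
        sequences.append(current)
    return sequences

for seq in phase_sequences([1, 2, 3, 5, 8, 13, 40], 8):
    print(seq)
```

Each phase at least doubles the smallest remaining processing time, and $J_0$ only contains jobs larger than $p_{\max}/m$, so after $\lceil \log_2 m \rceil$ phases nothing can remain.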

Now we turn to the lower bound. The instance uses groups of machines 
with geometrically decreasing speeds, but the number of machines increases even 
faster so that the total capacity of slow machines is larger than the capacity of 
the fast machines. 

Theorem 5.6. The competitive ratio of LIST is at least $\Omega(\log m)$.

Proof. Let us construct the hard instance. Choose integers a, b, g such that
$a > 2b > 4$ and g is arbitrary. The set of machines consists of groups $G_0, \ldots, G_g$,
where the group $G_i$ consists of $a^i$ machines with speed $b^{-i}$. The jobs are in similar
groups named $J_i$, each containing $a^i$ jobs of length $b^{-i}$. The input sequence is
a concatenation of these groups starting with the smallest job, that is,
$J = J_g, \ldots, J_0$. We name the phases by the groups of jobs processed in each phase
(i.e., we start by phase $J_g$; note that the indices of the groups are decreasing).

By scheduling each $J_k$ to $G_k$ we get OPT = 1, so it remains to prove that
$LIST(J) \ge \Omega(\log m)$.

For $k = 1, \ldots, g$, let $i_k = \sum_{i=0}^{k-1} a^i$ denote the number of processors in groups
$G_0, \ldots, G_{k-1}$. The choice of a guarantees that the number of jobs in $J_k$ is
$a^k > 2i_k$.

To prove a lower bound on LIST(J), we construct a sequence J' dominated
by J. Each group $J_k$, for $k = g, \ldots, 1$, is replaced by a corresponding group $J'_k$,
defined inductively below. The last group $J_0$ with a single job is unchanged, so
the sequence J' is defined as the concatenation of the groups $J'_g, \ldots, J'_1, J_0$.

To modify the group $J_k$, consider the schedule $LIST(J'_g, \ldots, J'_{k+1}, J_k)$. All
the jobs in $J_k$ have the same length, thus their completion times are
non-decreasing. We construct a group $J'_k$ by shortening the last $i_k$ jobs in $J_k$ so
that their completion times are equal. Denote this common completion time $\tau_k$.
For k = 0, the sequence is not modified and $\tau_0$ is the completion time of the
single job in $J_0$. Define also $\tau_{g+1} = 0$.

We prove by induction that, for each k = g, . . . ,0, (i) 1 > Tfc — Tk+i > 12(1) 
and (ii) in the schedule LIST{J'g, . . . , J'k+i, J'k), all the ik processors in groups 
Go, . . . , Gfc_i are busy until time Tk and all the machines are idle after time r^,. 

To start, note that (ii) for k = g + 1 holds trivially. Using (ii) for k + 1, it is
feasible to schedule all jobs in J_k on machines G_k starting from time τ_{k+1} without
preemptions and completing at time 1 + τ_{k+1}. The greedy schedule may schedule
the jobs on faster processors, which can only decrease their completion times, and
thus the first inequality of (i) holds. Using (ii) for l > k, it follows that the
work done on any job in J_k before time τ_{k+1} is at most

    Σ_{l=k+1}^{g} (τ_l − τ_{l+1}) b^{-l}  ≤  Σ_{l=k+1}^{g} b^{-l}  <  b^{-k}/(b − 1).

Consequently, all the completion times of jobs in J_k are larger than τ_{k+1};
thus τ_k > τ_{k+1} and (ii) holds for k by the structure of the LIST schedule.
Since the first i_k jobs from J_k are not shortened, the work done
on the machines in G_0, ..., G_{k-1} between τ_{k+1} and τ_k is at least

    i_k b^{-k} (1 − 1/(b − 1))  =  i_k b^{-k} (b − 2)/(b − 1).

The total capacity of the machines in G_0, ..., G_{k-1} is
Σ_{l=0}^{k-1} a^l b^{-l} ≤ a^k b^{-k}, using a/b > 2. Thus

    τ_k − τ_{k+1}  ≥  (b − 2)/(b − 1) · (b^{-k} a^{k-1}) / (a^k b^{-k})  =  (b − 2)/(a(b − 1))  =  Ω(1);

this finishes the proof of (i) and the whole induction.

Using Lemma 5.1, LIST(J) ≥ LIST(J') = τ_0 = Σ_{k=0}^{g} (τ_k − τ_{k+1}) ≥ g · Ω(1)
= Ω(log m). □



5.2 The Analysis of LPT 

LPT is a simple approximation algorithm (no longer online), thus it is interesting 
to know its performance. We show that the approximation ratio of LPT for 
preemptive variant is between 2 — and 2 — ^ . The proof of the upper bound 
is the same as the proof for the non-preemptive case from [18]. There, non-preemptive
LPT is used as an upper bound on NOPT in comparison to OPT. We need to
analyze the preemptive version of LPT, which possibly gives a different schedule
than the non-preemptive LPT. Examining the proof shows that the properties
of the non-preemptive LPT used in [18] are satisfied by the preemptive LPT as
well. The proof is omitted.

Theorem 5.7. Preemptive LPT produces schedules with makespan at most 2 — 
^ times (preemptive) OPT. 

An almost matching lower bound is shown by an instance consisting of m
identical machines and m + 1 identical jobs. Assuming unit jobs and machines,
LPT produces a schedule of makespan 2 (no preemptions are used), while the
optimal preemptive makespan is (m + 1)/m. This yields a lower bound of
2 − 2/(m + 1) on the approximation ratio.
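
The lower-bound instance can be checked numerically. The following is a minimal sketch, assuming identical unit-speed machines only (not the general uniformly related setting of the paper); the function names are illustrative. On identical machines, greedy LPT places each job on the currently least loaded machine, and the optimal preemptive makespan is the larger of the longest job and the average load.

```python
from fractions import Fraction


def lpt_makespan(jobs, m):
    """Greedy LPT on m identical machines: longest job first,
    each job goes to the currently least loaded machine."""
    loads = [Fraction(0)] * m
    for job in sorted(jobs, reverse=True):
        i = loads.index(min(loads))
        loads[i] += job
    return max(loads)


def preemptive_opt(jobs, m):
    """Optimal preemptive makespan on m identical machines:
    max(longest job, total work / m)."""
    return max(Fraction(max(jobs)), Fraction(sum(jobs), m))


m = 4
jobs = [1] * (m + 1)          # m + 1 unit jobs
ratio = lpt_makespan(jobs, m) / preemptive_opt(jobs, m)
print(ratio)                  # equals 2 - 2/(m + 1)
```

For m = 4 the ratio evaluates to 8/5, which agrees with 2 − 2/(m + 1).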



Conclusions 

Our main result is an improvement of online algorithms for preemptive scheduling
on uniformly related machines, Q|online-list, pmtn|C_max, by a factor of more
than 2. Still, some gap remains. Our intuition is that, similarly to the case of
identical machines, P|online-list, pmtn|C_max, randomization should not help for
preemptive online scheduling and thus the deterministic 4-competitive algorithm
can be improved.

Some proofs omitted in this version appear in the full version of the paper. 

Abstract. In this paper we study integrated prefetching and caching
in parallel disk systems. This topic has gained a lot of interest in the
last years which manifests itself in numerous recent approximation algorithms.
This paper provides the first negative result in this area by
showing that optimizing the stall time is APX-hard. This also implies
that computing the optimal processing time is NP-hard, which settles
an open problem posed by Kimbrel and Karlin.



1 Introduction 

In modern computer systems, the processor performance has increased at a much 
faster rate than memory access times. Especially disk accesses are very time 
consuming and represent a severe bottleneck in today’s systems. Hence, it is of 
growing importance to find advanced techniques to facilitate efficient interaction 
between fast and slow memory. 

In general, disks are partitioned into pages of a few KBytes in size. When 
some data on the page is needed, the whole page is copied into (fast) RAM 
memory and is accessed from there. Common tools to improve the performance 
of memory hierarchies are prefetching and caching. The cache of a computer 
system consists of a few memory blocks that can hold pages read from the disks. 
Keeping disk pages in cache, one can satisfy multiple requests to the same page. 
This results in an overall increase of the performance of the system. Because in 
general only a small fraction of the disk pages can be held in the cache, it is 
often necessary to evict a page from the cache in order to make room for the 
newly fetched one. 

Caching strategies [4] load a page into the cache upon request. The processor 
then stalls until the page is in cache. Prefetching [5] means loading a page into
the cache even before it is actually requested. But prefetching a page too early
can cause the eviction of a page that is requested shortly afterwards.

Integrated prefetching and caching on single disk systems was introduced by 
Cao et al. in [5]. They developed the first approximation algorithms to tackle 
the problem. Kimbrel and Karlin [11] extended the single disk model to multi- 
ple disks and proposed approximation algorithms for finding a prefetch/caching 
schedule for parallel disk systems with minimum stall time. Both theoretical and 
experimental studies were presented in numerous publications, e.g. [1, 2, 3, 5, 6, 7,
8, 9, 10, 11]. However, determining the complexity for the multiple disk problem
remained open until now. 



Model Definition Parallel Prefetching and Caching 

In our model, the computer system consists of D disks and one cache which can 
hold k pages. Every disk contains a set of distinct pages. The cache initially holds 
a subset of pages from the disks. Additionally, there is a sequence of requests
σ = σ_1 σ_2 ... σ_r. Each of them is denoted by a reference to one page. The task
is to satisfy all the requests of σ in turn. If a requested page is in the cache it
can be accessed immediately. This takes one time unit. Otherwise it has to be 
fetched, which takes F time units. It is possible to prefetch a page before it is 
requested. At any time, at most one fetch operation per disk can be processed. 
If the cache is completely filled, it is necessary to evict a page in order to fetch a 
new one. While the fetch operation is in progress, neither the incoming nor the 
evicted page is available for access. When a requested page is not in the cache, 
the system stalls until the page arrives. The goal is to find a prefetch/caching 
schedule that minimizes the stall time of the system. 

Approximation algorithms for this problem can be evaluated either by the 
stall time or by the elapsed time. Computing the optimum of these two measures 
is equivalent, but approximating the stall time is more difficult than approxi- 
mating the elapsed time. 



Example 

We consider a system with two disks D_1 = {A, B}, D_2 = {c, d}, a cache of size
two holding C = [A, B] initially, and a sequence σ = (B, A, c, d, B) of requests.
It takes F = 2 time units to fetch a page from a disk. The stall time for the schedule
depicted in Figure 1 is two. The first page missing is c. The prefetch operation
for c starts at time t = 1, evicting page B after it is requested. Because disk D_2
is busy until t = 3, the prefetch of page d, evicting A, cannot start before t = 3.
After processing page c at time t = 4, a prefetch operation is started to refetch
page B.
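
The stall time of this example can be reproduced with a small simulation of the model. This is only a sketch under simplifying assumptions: time is discrete, each fetch is given as a hypothetical triple (start time, fetched page, evicted page), and every fetch evicts exactly one page, so the cache size never needs to be checked separately. The function name and data layout are illustrative, not taken from the paper.

```python
def simulate(requests, initial_cache, disk_of, fetch_time, fetches):
    """Return (stall_time, elapsed_time) of a prefetch/caching schedule.

    fetches: list of (start_time, page_in, page_out) with integer
    start times >= 0 measured in wall-clock time units.
    """
    cache = set(initial_cache)
    arriving = {}            # page -> time at which its fetch completes
    disk_busy_until = {}     # disk -> time at which it becomes free
    pending = sorted(fetches)
    t = stall = served = 0
    while served < len(requests):
        # Pages whose fetch has completed become available.
        for page in [q for q, when in arriving.items() if when <= t]:
            cache.add(page)
            del arriving[page]
        # Start all fetch operations scheduled for time t.
        while pending and pending[0][0] == t:
            _, page_in, page_out = pending.pop(0)
            disk = disk_of[page_in]
            if disk_busy_until.get(disk, 0) > t:
                raise ValueError("disk is still busy")
            if page_out not in cache:
                raise ValueError("evicted page is not in cache")
            cache.remove(page_out)   # unavailable during the fetch
            arriving[page_in] = t + fetch_time
            disk_busy_until[disk] = t + fetch_time
        page = requests[served]
        if page in cache:
            served += 1              # serving a request takes one unit
        else:
            if page not in arriving and not pending:
                raise ValueError("request can never be served")
            stall += 1               # one unit of stall time
        t += 1
    return stall, t


disk_of = {"A": "D1", "B": "D1", "c": "D2", "d": "D2"}
schedule = [(1, "c", "B"), (3, "d", "A"), (4, "B", "c")]
print(simulate(list("BAcdB"), {"A", "B"}, disk_of, 2, schedule))
```

Under these assumptions the schedule described above incurs two units of stall time and finishes after seven time units.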

There are two graphical representations for the schedule, both displayed in 
Figure 1. The first one uses the cache state, the second one the operations of 
every disk. We are going to use the disk representation. 

Fig. 1. Example for two disks.





Previous Work 

For integrated prefetching and caching on single disk systems, Cao et al. [5] 
introduced two algorithms called Conservative and Aggressive. They also proved 
that on single disk systems, there always exists an optimal schedule which fulfills 
the following four basic rules. 

1. Optimal Prefetch: Always fetch the missing page that will be requested soonest.

2. Optimal Eviction: Always evict the page in the cache whose next request is 
furthest in the future. 

3. Do No Harm: Never evict a page A to fetch page B when A’s next request 
is before B's next request. 

4. First Opportunity: Never evict page A to fetch page B when the same thing 
could have been done one time unit earlier. 

The first two rules specify which page to prefetch and to evict. The last
two rules indicate the time at which a prefetch operation should be initiated.
However, these rules do not uniquely determine a prefetch/caching schedule.
Nevertheless, they provide some guidance on how to design algorithms.
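
The first three rules can be sketched as a single selection step. This is an illustrative sketch, not the Conservative or Aggressive algorithm itself: it assumes the full request sequence is known, ignores pages that are currently being fetched, and uses hypothetical function names. Ties are broken alphabetically only to make the result deterministic.

```python
import math


def next_use(requests, pos, page):
    """Index of the next request to `page` at or after position `pos`."""
    for t in range(pos, len(requests)):
        if requests[t] == page:
            return t
    return math.inf


def choose_fetch_and_evict(requests, pos, cache):
    """Pick (page_to_fetch, page_to_evict) or None if no fetch is useful."""
    missing = sorted({p for p in requests[pos:] if p not in cache})
    if not missing:
        return None
    # Optimal Prefetch: the missing page requested soonest.
    fetch = min(missing, key=lambda p: next_use(requests, pos, p))
    # Optimal Eviction: the cached page requested furthest in the future.
    evict = max(sorted(cache), key=lambda p: next_use(requests, pos, p))
    # Do No Harm: never evict a page needed before the fetched page.
    if next_use(requests, pos, evict) < next_use(requests, pos, fetch):
        return None
    return fetch, evict


print(choose_fetch_and_evict(list("BAcdB"), 1, {"A", "B"}))
```

For the request sequence of the example above, after B has been served, the step selects fetching c and evicting B.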

The algorithm Conservative performs exactly the same fetch operations as 
the optimum offline paging algorithm MIN [4] and starts these operations as soon 
as possible. The elapsed time of a Conservative schedule is at most twice the 
elapsed time of an optimum schedule. The Aggressive algorithm starts prefetch
operations as soon as possible, following the four basic rules. Cao et al. proved
that the approximation ratio, with respect to the elapsed time, is at most
min{1 + F/k, 2}. Recently, Albers and Büttner [1] improved this ratio to
min{1 + F/(k + ⌊k/F⌋ − 1), 2}. They also presented a new family of algorithms called Delay(d),
which delays the next possible fetch operation for d time units. They could prove
that Delay(d) combined with Aggressive has a better approximation ratio than
the two previously known algorithms. Experimental studies on the performance
of these algorithms were presented in [5] and [10].

It was proven in [2] that an optimal prefetching/caching schedule for a se- 
quence of requests on a single disk system can be computed in polynomial time 
using a linear programming formulation. In [3] it was shown that this linear
program can be translated into a special multi-commodity flow instance which
can be solved in polynomial time.

Kimbrel and Karlin [10] extended the single disk model to multiple disks and 
analyzed Conservative and Aggressive in parallel disks systems. They showed 
that an optimal parallel prefetching/caching schedule for multiple disks
obeys three of the four basic rules for the single disk case. Only the Do No Harm
rule gets violated. They proposed the Reverse Aggressive algorithm and showed
that it approximates the minimum elapsed time up to a factor of 1 + DF/k.

The linear programming formulation of [2] can be generalized to multiple 
disks and achieves a D-approximation for the stall time of the system with D − 1
additional pages in cache. Recently, Albers and Büttner [1] gave an algorithm
for integrated prefetching and caching on multiple disks which achieves the same 
optimal stall time, but uses up to i{D — 1) extra cache positions. 

In [13], Vitter and Shriver describe an alternative model for parallel disk 
systems. For their model, Kallahalla and Varman [8] have shown that an optimal 
schedule can be found in time O(n log(k)).



Notation 

A prefetching instance 𝒫 is a 4-tuple (𝒟, C, F, σ). By 𝒟 = {D_1, D_2, ..., D_m} we
denote the set of m disks. C represents the cache. F is the time needed to fetch
a page from a disk to the cache and σ = σ_1 σ_2 ... σ_n the sequence of requests to
pages. We say a page is active at time t if it is in cache or it is being fetched.
We denote a fetch operation in which page b gets evicted in order to fetch page
a by (a, b).

For a sequence σ of requests let σ_t be the t-th request. The gap of a page p
at σ_t is the number of requests to pages different from p until the next request to
p. Hence, the gap of p at σ_t, if the next request to p is σ_{t'}, is t' − t − 1. Finally,
we will refer to a schedule with at most ℓ stall times as an ℓ-schedule.
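
The gap of a page can be computed directly from this definition. The following sketch uses 1-based positions to match the notation above; returning None when the page is never requested again is a convention chosen here, not one stated in the paper.

```python
def gap(requests, t, page):
    """Gap of `page` at the t-th request (1-based): t' - t - 1,
    where t' > t is the position of the next request to `page`."""
    for t_next in range(t + 1, len(requests) + 1):
        if requests[t_next - 1] == page:
            return t_next - t - 1
    return None  # convention of this sketch: no further request


sigma = list("BAcdB")
print(gap(sigma, 1, "B"))  # next request to B is the 5th, so 5 - 1 - 1
```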

2 MinDel2Sat and Monotone MinDel2Sat Are APX-Hard 

In this section we prove that MinDel2Sat is APX-hard. This result will be
used to prove the APX-hardness of Monotone MinDel2Sat. In turn, this
result will be used in the next section to show that minimizing the stall time of
a prefetching instance is APX-hard.

The MinDel2Sat problem is very similar to Max2Sat. The only difference 
is that the objective is to minimize the number of unsatisfied clauses. In the 
Monotone MinDel2Sat problem, only monotone 2SAT formulas are allowed.
A 2SAT formula ℱ is called monotone if no clause contains both positive and
negative literals.

Lemma 1. MinDel2Sat is APX-hard.

Proof. It was shown in [12] that for a Max2Sat instance it is NP-hard to
decide whether at least a fraction of || of the clauses, or if at most a fraction
of of the clauses can be satisfied. From these factors we can conclude that
it is NP-hard to decide if at least a fraction of || of the clauses, or if at most
a fraction of of the clauses have to be unsatisfied. It is hard to approximate
MinDel2Sat within a factor of − ε. □

Lemma 2. Monotone MinDel2Sat is APX-hard.

Proof. The proof is done by a gap preserving reduction from MinDel2Sat:
For any variable x that appears in the formula ℱ, we introduce a new variable x̄
that represents the negation of x. The new formula is obtained by replacing all
appearances of ¬x in ℱ by x̄ and adding to ℱ, for each variable, m copies of the
two clauses (x ∨ x̄) and (¬x ∨ ¬x̄). If m is chosen equal to the number of clauses
in ℱ, then in an optimal solution all the added clauses will be fulfilled.
It is easy to see that this reduction is gap-preserving and that Monotone
MinDel2Sat is hard to approximate within a factor of − ε. □
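
The transformation in this proof can be sketched in a few lines. The encoding is an assumption made for illustration: a literal is a pair (variable name, is_positive), a clause is a tuple of two literals, and the new variable for x is named by appending "_bar".

```python
def to_monotone(clauses):
    """Turn a 2SAT formula into a monotone one as in the reduction:
    replace each negative literal by a fresh positive variable and add
    m copies of (x OR x_bar) and (NOT x OR NOT x_bar) per variable."""
    m = len(clauses)
    variables = sorted({name for clause in clauses for name, _ in clause})

    def rewrite(literal):
        name, positive = literal
        return (name, True) if positive else (name + "_bar", True)

    result = [tuple(rewrite(lit) for lit in clause) for clause in clauses]
    for x in variables:
        x_bar = x + "_bar"
        result += [((x, True), (x_bar, True))] * m
        result += [((x, False), (x_bar, False))] * m
    return result


def is_monotone(clause):
    """A clause is monotone if all its literals have the same sign."""
    return len({positive for _, positive in clause}) == 1


formula = [(("x", True), ("y", False)), (("x", False), ("y", False))]
monotone = to_monotone(formula)
print(len(monotone), all(is_monotone(c) for c in monotone))
```

With m original clauses over v variables, the output has m + 2vm clauses, all of them monotone.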



3 Approximating the Stall Time for ParPrefetch Is
APX-Hard

We consider the following problem: 

ParPrefetch: Given a prefetching instance 𝒫 = (𝒟, C, F, σ). Find the minimum
ℓ for which there exists an ℓ-schedule for 𝒫.



Theorem 1. ParPrefetch is APX-hard. Hence, there exists no PTAS unless
P = NP.

The proof is done by a gap preserving reduction from Monotone MinDel2Sat.
For a given monotone 2-SAT formula ℱ, we construct an instance ℐ
of ParPrefetch such that there exists a prefetching schedule for ℐ with stall
time at most 2ℓ if and only if there exists a truth assignment that leaves at
most ℓ clauses unsatisfied. Let ℱ be a monotone 2-SAT formula composed of n
variables x_1, x_2, ..., x_n and m monotone clauses C_1, C_2, ..., C_m. Note that we can
assume ℓ < m.

The instance ℐ will have n + 1 disks. For every variable x_i in ℱ, there will
be a disk D_i on which the pages a_i, b_i, c_i, d_i, and e_i are stored. These disks
and pages are called variable disks and variable pages. Additionally, there is a
disk called P, which contains the page p. The cache has 4n + 1 slots. Initially, it
contains all pages except d_i, 1 ≤ i ≤ n.

The request sequence σ is composed of 2m + 4 rounds, starting with round 0.
Every round contains exactly 5(4n + 4m) requests. A round is composed of five
blocks, each having 4n + 4m requests. We will often refer to them as xy-blocks,
with xy ∈ {ea, ab, bc, cd, de}. As with the rounds, the first block of σ will be
called block 0. The fetch time is F = 2(4n + 4m). Observe that the fetch time
F is exactly the length of two blocks.

For each clause C_j in ℱ, there is one round implemented by a clause gadget
σ_{C_j}. The remaining rounds will be implemented by so-called bridge rounds σ_B.
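
The sizes fixed so far follow directly from n and m. The helper below is a small sketch that collects them; the function and key names are illustrative and not part of the paper.

```python
def instance_parameters(n, m):
    """Sizes of the ParPrefetch instance built from a monotone 2-SAT
    formula with n variables and m clauses."""
    block = 4 * n + 4 * m
    rounds = 2 * m + 4
    return {
        "disks": n + 1,              # n variable disks plus disk P
        "cache_slots": 4 * n + 1,    # all pages except the d_i
        "rounds": rounds,
        "block_length": block,
        "round_length": 5 * block,   # five blocks per round
        "fetch_time": 2 * block,     # F equals two blocks
        "total_requests": rounds * 5 * block,
    }


print(instance_parameters(3, 2))
```

All quantities are polynomial in n and m, which is what the polynomial-time construction of the instance relies on.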

The basic idea of the reduction is the following. The ParPrefetch instance
ℐ is chosen such that for each variable disk D_i, there are exactly two ways how
its pages can be loaded into the cache without stalling the system for more than
2ℓ units. It all depends on which page of D_i is evicted first. We will call these two
ways policy T and policy F. The policy of a variable disk D_i can be interpreted as
a truth assignment of the corresponding variable x_i in ℱ. In every clause gadget
σ_{C_j}, it is checked whether the chosen policies satisfy the corresponding clause
C_j in ℱ, incurring two units of stall time if they do not.

Let us start with the definition of σ_B:

    σ_B :=  e_1 a_1 p p  e_2 a_2 p p  ...  e_n a_n p p  p ... p  ⊕
            a_1 b_1 p p  a_2 b_2 p p  ...  a_n b_n p p  p ... p  ⊕
            b_1 c_1 p p  b_2 c_2 p p  ...  b_n c_n p p  p ... p  ⊕
            c_1 d_1 p p  c_2 d_2 p p  ...  c_n d_n p p  p ... p  ⊕
            d_1 e_1 p p  d_2 e_2 p p  ...  d_n e_n p p  p ... p        (1)

Each trailing run p ... p has length α = 4m.

⊕ denotes a concatenation. Notice that each row of (1) corresponds to a
block. For space considerations in the figures, we introduce α := 4m. In the
following, we describe how the sequence Λ of (2) can be served with stall time
at most 2ℓ.

    Λ := σ_B ⊕ σ_B ⊕ ... ⊕ σ_B   (2m + 4 times)        (2)

In a 2ℓ-schedule of Λ, every disk D_i is either on policy T or policy F. In both
policies, each disk has exactly four active pages at the end of each block. If a
disk D_i is on policy T, then the set of active pages does not change within odd
blocks. Hence, during odd blocks, no fetch operation that involves a page of D_i
is initiated. On the other hand, in every even block, the initially inactive page
of D_i has to be fetched, while exactly one of the active pages of D_i is evicted.
Remember that the first block of Λ is block 0, and hence is even. For a disk D_i
on policy F, the fetch operations start in the odd blocks, whereas no operations
involving pages from D_i begin in even blocks.

Figure 2 describes the two policies by stating which pages are fetched and 
evicted in the corresponding blocks. The nodes in Figure 2 represent the five 
different kinds of blocks. Their circular ordering in A is indicated by the dotted 
arrows. In an xy-block, only the x_i pages can get evicted. The pages which are
fetched are denoted by the arc leaving the xy-node. The only exception is block
0. Although it is an ea-block, the page fetched in this case is d_i.

The following Figures 3-5 show how Λ can be served with zero stall time.
The two policies for each variable disk are indicated. Policy F is the one with the
dashed arcs. Figure 3 shows the two policies in round 0. Figures 4 and 5 display
what the policies look like in even and odd rounds of Λ, respectively.

The following request sequence σ completes our reduction.

    σ := σ_B ⊕ σ_B ⊕ ⨁_{j=1}^{m} { σ_{C_j} ⊕ σ_B   if C_j positive
                                  { σ_B ⊕ σ_{C_j}   if C_j negative   ⊕ σ_B ⊕ σ_B

Fig. 2. The fetch/evict cycle of D_i.

Fig. 3. The two policies per variable in round 0.

Fig. 4. The two policies per variable in even rounds of Λ and σ.

Fig. 5. The two policies per variable in odd rounds of Λ and σ.

Also σ consists of 2m + 4 rounds. The first two and the last two rounds are σ_B
rounds. Then for each clause, we put exactly two rounds. For a positive clause,
we put σ_{C_j} σ_B, where σ_{C_j} is called a clause gadget for C_j. For a negative clause,
we put σ_B σ_{C_j}. Note that σ_{C_j} is an even round of σ if and only if C_j is positive.
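
The round structure can be sketched as follows. The sketch assumes that a bridge round consists of the five blocks ea, ab, bc, cd, de, each made of the requests x_i y_i p p for i = 1, ..., n followed by 4m requests to p; page names such as "a1" and the labels "B" and "C1" are illustrative encodings chosen here.

```python
BLOCKS = ["ea", "ab", "bc", "cd", "de"]


def bridge_round(n, m):
    """Request list of one bridge round: five blocks of 4n + 4m requests."""
    requests = []
    for x, y in BLOCKS:
        for i in range(1, n + 1):
            requests += [f"{x}{i}", f"{y}{i}", "p", "p"]
        requests += ["p"] * (4 * m)
    return requests


def round_layout(clause_is_positive):
    """Order of rounds in sigma: 'B' is a bridge round, 'Cj' the clause
    gadget of clause j. Rounds are numbered starting with 0."""
    layout = ["B", "B"]
    for j, positive in enumerate(clause_is_positive, start=1):
        layout += [f"C{j}", "B"] if positive else ["B", f"C{j}"]
    return layout + ["B", "B"]


layout = round_layout([True, False, True])
print(layout)
print(len(bridge_round(3, 2)))
```

In the resulting layout a clause gadget sits at an even round index exactly when its clause is positive, and the total number of rounds is 2m + 4.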

The σ_{C_j} rounds are defined as follows.

    σ_{C_j} :=  e_1 a_1 p p      e_2 a_2 p p      ...  e_n a_n p p      p ... p  ⊕
                a_1 b_1 r_1 s_1  a_2 b_2 r_2 s_2  ...  a_n b_n r_n s_n  ...      ⊕
                b_1 c_1 c_1 c_1  b_2 c_2 c_2 c_2  ...  b_n c_n c_n c_n  ... d_n  ⊕
                c_1 d_1 d_1 d_1  c_2 d_2 d_2 d_2  ...  c_n d_n d_n d_n  p ... p  ⊕
                d_1 e_1 p p      d_2 e_2 p p      ...  d_n e_n p p      p ... p

As in (1), each trailing run has length α = 4m.



The bold requests denoted by r_i and s_i depend on the variables in C_j. If
Xu and Xy (with u < v) are in Cj, we set r„ := and s„ := a„, otherwise 
for all i ^ V, we use Si,ri := bi. Observe that if we chose Si,ri := bi for all i, 
all the 0-schedules of Λ would also be stall time free for σ. This is easy to see
since the only difference between σ_B and σ_{C_j} is that a few requests to p have
been replaced by requests to pages which are in cache in both policies T and F.
For example, within the bc-block of σ_{C_j}, two requests to p got replaced by two
requests to c_1. But since c_1 is the second request in this bc-block and the first
request of the subsequent cd-block, it will be in cache for sure.



Fig. 6. Clause gadget for (x_1 ∨ x_3). Some requests are omitted here, since they are easy
to serve in both policies. The picture for (¬x_1 ∨ ¬x_3) can be obtained by interchanging
the two policies.



Observe that in σ, none of the variable pages has a gap larger than 2F. We
now prove that the two policies described for Λ also exist for σ. That is, in every
2ℓ-schedule, each variable disk must be in one of the two policies. To this aim,
we need to prove two simple lemmas.

Lemma 3. In a 2ℓ-schedule of σ, the following holds. At the end of each block,
page p is active and there are exactly four active pages per variable disk.

Proof. First consider page p. With the exception of the ab-block and the bc-block
in a round σ_{C_j}, the last request of a block is always to p. Hence, p certainly
is active at the end of such a block. In the two exceptional cases, its gap is at
most F − 4m < F − 2ℓ. From this we can conclude that p needs to be active
there as well.

Concerning the variable pages, note that at the end of each block, the gap
of all variable pages is at most 2F − 4m. Therefore, if a disk has fewer than four
active pages, we will incur stall time larger than 4m > 2ℓ. Since p is active at
the end of each block and there are exactly 4n + 1 cache slots, all variable disks
have exactly four active pages. □

Lemma 4. In a 2ℓ-schedule, a page s_i from a disk D_i with at most four active
pages can be evicted only if the gap of the non-active page of D_i or the gap of s_i
is at least 2F − 2ℓ.

Proof. If both pages of D_i have gap smaller than 2F − 2ℓ, we have to fetch two
pages within less than 2F time units to prevent more than 2ℓ stall times. Since
the two pages are from the same disk, this is not possible. □

Lemma 5. In a 2ℓ-schedule, every disk is on one of the two policies.

Proof. We use induction on the blocks in order to prove the lemma. For the
base case, consider the first two blocks. Until the end of the second block, all
disks must be fetching their d_i page. Clearly, p cannot be evicted in the first two
blocks. Hence, every variable disk has to evict one of its pages in the first two
blocks. From Lemma 4, we can conclude that there are only two options per
disk, namely either e_i during the first block or a_i during the second block.

For the induction step, let us consider block β ≥ 2. W.l.o.g. let it be an
ab-block. Then the block β − 2 is a de-block and block β + 2 is a cd-block.

Let us first consider the case where a page was fetched from disk D_i in block
β − 2. By the induction hypothesis, its d_i page was evicted and d_i was not fetched
in block β − 2 and β − 1. Thus, its gap at the end of block β will be smaller than
F − 2ℓ. Hence page d_i needs to be fetched in block β. Because of Lemma 3, a
page of disk D_i must be evicted in block β. By Lemma 4, we obtain that only
a_i can be evicted.

Concerning the disks that evicted and fetched a page in block β − 1, we now
prove that they cannot do so in block β: Let D_i be a disk that fetched a page in
block β − 1. This fetch operation will terminate not before the last 2ℓ requests of
block β. But at this point, all the pages of D_i have a gap smaller than 2F − 2ℓ.
Hence, none of them can be evicted and therefore, because of Lemma 3, D_i
cannot fetch a page either. □

Lemma 6. The ParPrefetch instance ℐ can be processed with at most 2ℓ
units of stall time if and only if there exists a truth assignment for ℱ that leaves
at most ℓ clauses unsatisfied.

Proof. Since we know that every variable disk must be on one of the two policies,
there is not much freedom left. Basically, the schedules will look like the one for
Λ described in Figures 3-5. The only difference is within the clause gadgets,
where p can be evicted.

Let us first analyze the behavior of a positive clause gadget C_j = (x_u ∨ x_v)
(the proofs for negative clauses are very similar). Figure 6 will be used as an
instructive example.

If both disks D_u and D_v (u < v) are on policy F, consider σ_{C_j} just after the
request r_u in the ab-block. In Figure 6, this corresponds to the first request in
the dark shaded area. Consider those disks from {D_1, D_2, ..., D_n} which are on
policy F. Let M be the set of the pages stored on these disks.

Until just after r_u, the only pages that can have been evicted in this block
in order to fetch d_i pages are page p and those in M \ {a_u, a_v}. Hence, there is at
least one d_i page from a disk in M which is not fetched by now. Since this page
has gap at most F − 2, there will be at least two stall times.

Indeed, there is always a solution that produces two stall times. Namely, for
all i ≠ u, the operation (d_i, a_i) is executed as soon as possible. Hence, (d_v, a_v)
will be two time units late. Concerning D_u, the operation (d_u, p) starts right
after the first request to a_u in the ab-block, and (p, a_u) starts after the second
request to a_u. In Figure 6, this means that the operation (d_1, a_1) gets replaced by
(d_1, p). Later, p is refetched by (p, a_1) right after the dark shaded requests (indicated
by (p, a*)). Since (d_3, a_3) cannot start before the request to a_3 in the dark shaded
area, we will incur two stall times waiting for d_3 to arrive in cache in the next
cd-block.

If at most one of D_u, D_v is on policy F, by the use of the cache slot containing
p, we do not incur any stall time during σ_{C_j}. Assume that D_v is on policy F.
Then we evict p just after the first request to in the a&-block in order to fetch 
dy. After the second request to a„ in the o6-block, we can now evict a„ in order 
to refetch p. 

With the above findings, we can now conclude the proof: If there is no truth
assignment of ℱ such that at most ℓ clauses remain unsatisfied, there cannot be
a schedule with at most 2ℓ stall time. This holds since every disk must be in one
of the two policies and since there are two units of stall time in σ_{C_j} whenever
the two disks appearing in C_j are on the wrong policy. On the other hand,
a truth assignment with at most ℓ unsatisfied clauses can be translated into a
schedule with at most 2ℓ stall times by running disk D_i on policy T if and only
if x_i = true. □
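
The correspondence in Lemma 6 can be illustrated on small formulas by brute force. The sketch below only computes the MinDel2Sat optimum, that is, the minimum number of unsatisfied clauses; by the lemma, twice this value would be the minimum stall time of the constructed instance. It does not build or schedule the prefetching instance itself. The clause encoding, with literals as (variable name, is_positive) pairs, is an assumption of this sketch.

```python
from itertools import product


def unsatisfied(clauses, assignment):
    """Number of clauses with no true literal under `assignment`."""
    return sum(
        1
        for clause in clauses
        if not any(assignment[name] == positive for name, positive in clause)
    )


def min_deletions(clauses, variables):
    """Minimum number of unsatisfied clauses over all truth assignments
    (exponential brute force, suitable for tiny examples only)."""
    best = len(clauses)
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        best = min(best, unsatisfied(clauses, assignment))
    return best


# (x OR y), (NOT x OR NOT y), (x OR x), (y OR y): all clauses are monotone.
formula = [
    (("x", True), ("y", True)),
    (("x", False), ("y", False)),
    (("x", True), ("x", True)),
    (("y", True), ("y", True)),
]
ell = min_deletions(formula, ["x", "y"])
print(ell, 2 * ell)  # minimum unsatisfied clauses, corresponding stall time
```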

With the previous lemmas, the main theorem follows easily. 

Proof of Theorem 1. The instance ℐ can be constructed in polynomial time.
From Lemma 6, it follows that ℐ has a 2ℓ-schedule if and only if there is a truth
assignment for ℱ in which at most ℓ clauses are unsatisfied. Obviously, the reduction is
gap preserving and therefore the theorem is proven. A simple calculation shows
an inapproximability factor of || − ε. □

4 Open Problems and Acknowledgments 

There are many ways to extend our result. First of all, our result only shows that
the stall time is hard to approximate. Concerning the elapsed time, we can only
prove NP-hardness. An interesting open question is to decide whether there
exists a PTAS for this variant.

If the number of cache slots is bounded, ParPrefetch can be solved in
polynomial time, since the number of cache states is then polynomial. It would
be very interesting to know whether a polynomial time algorithm also exists for
a bounded number of disks or bounded fetch time.

We are grateful to Riko Jacob for proofreading and Sebastian Seibert for
clarifying discussions.

Abstract. An instance of the stable marriage problem is an undirected
bipartite graph G = (X ∪ W, E) with linearly ordered adjacency lists;
ties are allowed. A matching M is a set of edges no two of which share
an endpoint. An edge e = (a, b) ∈ E \ M is a blocking edge for M if
a is either unmatched or strictly prefers b to its partner in M, and b is
either unmatched or strictly prefers a to its partner in M or is indifferent
between them. A matching is strongly stable if there is no blocking edge
with respect to it. We give an O(nm) algorithm for computing strongly
stable matchings, where n is the number of vertices and m is the number
of edges. The previous best algorithm had running time O(m²).

We also study this problem in the hospitals-residents setting, which is
a many-to-one extension of the above problem. We give an
O(m(|R| + Σ_{h∈H} p_h)) algorithm for computing a strongly stable matching in the
hospitals-residents problem, where |R| is the number of residents and p_h
is the quota of a hospital h. The previous best algorithm had running
time O(m²).



1 Introduction 

An instance of the stable marriage problem is an undirected bipartite graph
G = (X ∪ W, E) where the adjacency lists of vertices are linearly ordered with
ties allowed. As is customary, we call the vertices of the graph men and women,
respectively.¹ Each person seeks to be assigned to a person of the opposite sex
and his/her preference is given by the ordering of his/her adjacency list. In a's
list, if the edges (a, b) and (a, b') are tied, we say that a is indifferent between b
and b', and if the edge (a, b) strictly precedes (a, b'), we say that a prefers b to
b'. We use n for the number of vertices and m for the number of edges. A stable
marriage problem is called complete if there are an equal number of men and
women and G is the complete bipartite graph; thus m = (n/2)².

* Partially supported by the Future and Emerging Technologies programme of the EU
under contract number IST-1999-14186 (ALCOM-FT).

** Work done while the author was at MPII supported by Marie Curie Doctoral
Fellowship.

¹ We use x, x', x'' to denote men and w, w', w'' to denote women, and a, a', b, b' to
denote persons of either sex.

A matching M is a set of edges no two of which share an endpoint. If (a, b) ∈
M we call b the partner of a and a the partner of b. A matching M is strongly
stable if there is no edge (a, b) ∈ E \ M (called blocking edge) such that by
becoming matched to each other, one of a and b (say, a) is better off and b
is not worse off. For a, being better off means that it is either unmatched in M
or strictly prefers b to his/her partner in M, and for b, being not worse off means
that it is either better off or is indifferent between a and his/her partner in M.
In other words, a would prefer to match up with b and b would not object to
the change.
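
The definition of a blocking edge translates into a direct check. The sketch below is a verifier for a given matching, not the algorithm of this paper. It assumes a hypothetical encoding in which rank[a][b] is the position of b in a's preference list (smaller is better, equal values mean a tie) and the matching is a symmetric dictionary from each matched vertex to its partner.

```python
def blocking_edges(rank, matching):
    """Return all ordered pairs (a, b) such that a is better off and
    b is not worse off if a and b become matched to each other."""

    def better_off(a, b):
        partner = matching.get(a)
        return partner is None or rank[a][b] < rank[a][partner]

    def not_worse_off(a, b):
        partner = matching.get(a)
        return partner is None or rank[a][b] <= rank[a][partner]

    blocking = []
    for a in rank:
        for b in rank[a]:
            if matching.get(a) == b:
                continue  # edges of M cannot block M
            if better_off(a, b) and not_worse_off(b, a):
                blocking.append((a, b))
    return blocking


# x1 is indifferent between w1 and w2; w1 is indifferent between x1 and x2.
rank = {
    "x1": {"w1": 1, "w2": 1},
    "x2": {"w1": 1},
    "w1": {"x1": 1, "x2": 1},
    "w2": {"x1": 1},
}
m1 = {"x1": "w1", "w1": "x1"}
m2 = {"x1": "w2", "w2": "x1", "x2": "w1", "w1": "x2"}
print(blocking_edges(rank, m1))  # non-empty: m1 is not strongly stable
print(blocking_edges(rank, m2))  # empty: m2 is strongly stable
```

A matching is strongly stable exactly when the returned list is empty. The check inspects every edge twice, once per orientation.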

In this paper, we consider the problem of computing a strongly stable matching.
One of the motivations for this form of stability is the following. Suppose we
have a matching M and there exists a blocking edge (a, b). Suppose it is a that
becomes better off by becoming matched to b. It means that a is willing to take
some action to improve its situation and as b's situation would not get worse, it
might yield to a. If there exists no such edge, then M can be considered to be
reasonably stable since no two vertices a and b such that (a, b) is in E \ M gain by
changing their present state and getting matched with each other. Observe that
not every instance of the stable marriage problem has a strongly stable solution.

There are two more notions of stability in matchings. The matching M is
said to be weakly stable (or, super strongly stable) if there does not exist a pair
(a, b) ∈ E \ M such that by becoming matched to each other both a and b are
better off (respectively, neither of them is worse off). The problem of finding a
weakly stable matching of maximum size was recently proved to be NP-hard [7].
There is a simple O(n²) algorithm [6] to determine if a super strongly stable
matching exists or not and it computes one, if it exists.

The stable marriage problem can also be studied in the more general con- 
text of hospitals and residents. This is a many-to-one extension of the classical 
men-women version. An instance of the hospitals-residents problem is again an 
undirected bipartite graph $(R \cup H, E)$ with linearly ordered (allowing ties)
adjacency lists. Each resident $r \in R$ seeks to be assigned to exactly one hospital,
and each hospital $h \in H$ has a specified number $p_h$ of posts, referred to as its
quota. A matching M is a valid assignment of residents to hospitals, defined
more formally as a set of edges no two of which share the same resident and at
most $p_h$ of which share the hospital h.
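
The validity condition for a hospitals-residents matching can be checked directly. The sketch below is only an illustration; the function name `is_valid_assignment` and the encoding of M as a list of (resident, hospital) pairs are assumptions of this example.

```python
from collections import Counter

def is_valid_assignment(matching, edges, quota):
    """Check that `matching` is a set of edges of E in which every resident
    appears at most once and every hospital h appears at most quota[h] times."""
    matching = list(matching)
    edge_set = set(edges)
    if any(e not in edge_set for e in matching):
        return False
    per_resident = Counter(r for r, _ in matching)
    per_hospital = Counter(h for _, h in matching)
    return (all(c == 1 for c in per_resident.values())
            and all(c <= quota[h] for h, c in per_hospital.items()))
```

Setting every quota to 1 gives the usual matching condition of the men-women case.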

A blocking edge to a matching is defined similarly as in the case of men-women.
An edge $(a, b) \in E \setminus M$ is a blocking edge to M if a would prefer to match
up with b and b would not object to the change. A matching is strongly stable
if there is no blocking edge with respect to it. We also consider the problem of
computing a strongly stable matching in this setting. Observe that the classical
stable marriage problem is a special case of this general problem, obtained by setting
$p_h = 1$ for all hospitals.

Our Contributions: In this paper we give an $O(nm)$ algorithm to determine
a strongly stable matching for the classical stable marriage problem. We also
give an $O(m(|R| + \sum_{h \in H} p_h))$ algorithm to compute a strongly stable matching
in the hospitals-residents problem. The previous results for computing strongly
stable matchings are as follows. Irving [6] gave an $O(n^4)$ algorithm for computing
strongly stable matchings for men-women in complete instances. In [8] Manlove
extended the algorithm to incomplete bipartite graphs; the extended algorithm
has running time $O(m^2)$. In [4] an $O(m^2)$ algorithm was given for computing a
strongly stable matching for the hospitals-residents problem.

Our new algorithm for computing a strongly stable matching for the classical 
stable marriage problem can be viewed as a specialisation of Irving’s algorithm, 
i.e., every run of our algorithm is a run of his, but not vice versa. We obtain the 
improved running time by introducing the concept of levels. Every vertex has a
level associated with it; the level of a vertex can change during the algorithm. We
use the levels of vertices to search for special matchings which are level-maximal,
and this reduces the running time of the algorithm to $O(nm)$. We also use the
above ideas in the hospitals-residents problem and obtain an improvement over
[4].

The stable marriage problem has great practical significance [1,5,9]. The
classical results in stable marriage (no ties and complete lists) are the
Gale/Shapley theorem and algorithm [2]. Gusfield and Irving [3] cover many
of the results obtained in the area of stable matchings.

Organisation of the paper: In Section 2 we present our $O(nm)$ algorithm
for strongly stable matchings for the classical stable marriage problem. In Section
3 we present our $O(m(|R| + \sum_{h \in H} p_h))$ algorithm for the hospitals-residents
problem.



2 The Algorithm for Strongly Stable Marriage 

We review a variant of Irving’s algorithm [6] in Section 2.1 and then describe 
our modifications in Section 2.2. Figure 1 contains a concise write-up of our 
algorithm. 



2.1 Irving’s Algorithm 

We review a variant of Irving’s algorithm for strongly stable matchings. The
algorithm proceeds in phases and maintains two graphs $G'$ and $G_c$, both
subgraphs of G. $G_c$ is the current graph in which we compute maximum
matchings and $G'$ is the graph of edges $E'$ not considered relevant yet. In each
phase, a certain subset of the edges of $G'$ is moved to $G_c$. Edges also get deleted
from $G'$ and $G_c$. We use $\mathcal{E}_i$ to denote the edges moved in phase i and
$\mathcal{E}_{\le i}$ to denote the edges moved in the first i phases. Initially, we have
$G' = G$ and $E_c = \emptyset$.

At the beginning of phase i, $E_c \subseteq \mathcal{E}_{<i}$ and we have a maximum matching M
in $G_c$. Also, if a man is free with respect to M, then no edges of $E_c$ are incident
to it. Let $\mathcal{E}_i$ consist of the top choices^1 in $E'$ of each free man. We say that every
free man proposes to all women currently at the top of his list. When a woman
receives a proposal from a man x, she deletes all strict successors of x from $E'$
and $E_c$. This may also remove edges in M.

^1 Recall that $E' \subseteq E$ and that adjacency lists are linearly ordered with ties allowed.
The top choices for a man x are the set of women tied for first place.

Observe that the rules for adding and deleting edges guarantee that if $(a, b) \in E_c$
and $(a, b') \in E_c$ then a is indifferent between b and b'. For a free man x, all
his top choices are moved to $E_c$ and hence edges in $E'$ go to strictly inferior
women. A woman keeps only the best proposals made to her and hence edges in
$E'$ go either to strictly superior men or to men tied with her choices in $E_c$.

Next we extend M to a maximum matching in $E_c$. During this process,
further edges may be deleted. We iterate over the free men in arbitrary order.
Let x be any free man. If there is an augmenting path starting at x, we use
it to increase the cardinality of the matching. Otherwise, let Z be the set of
men reachable from x by alternating paths and let N(Z) be the set of women
adjacent to Z in $E_c$. For each woman $w \in N(Z)$ we delete^2 all lowest ranked
edges in $E_c \cup E'$ incident to it. This is at least one edge $(x, w) \in E_c$ and zero or
more edges $(x', w) \in E'$.

At the end of the phase, we have a maximum matching in $E_c$. Also, every
free man is isolated in $G_c$ since the edges incident to it were removed when we
searched for an augmenting path starting from it.

The algorithm terminates when all free men have run out of proposals. Let
M be the final matching and let $G_c$ be the final graph. Then M is a maximum
matching in $G_c$ and all free men are isolated in $G_c$ and $G'$. M is a strongly
stable matching in G if no woman that was ever non-isolated in $G_c$ during the
execution of the algorithm is free with respect to M.^3

We refer the reader to [6,8] for the proof of correctness of this algorithm.
The algorithm runs in $O(m^2)$ time since the cost summed over all phases is
$O(m \cdot (1 + \text{number of successful augmenting path computations}))$ and since the
number of augmenting path computations is at most m. The latter claim follows
from the fact that a matched man becomes free only if the matching edge incident
to it is deleted.
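
The variant reviewed above can be sketched in Python as follows. This is a minimal, unoptimised sketch of the version that augments along an arbitrary augmenting path, so it illustrates the $O(m^2)$ variant and not the level-maximal one. The function name, the input encoding (`pref_m` as lists of tie groups, `rank_w` as rank dictionaries) and the reading of "free man" in the proposal step as "man isolated in $G_c$" are assumptions of this example.

```python
def strongly_stable_matching(pref_m, rank_w):
    """pref_m[x]: list of tie groups of women, best group first.
    rank_w[w][x]: rank of man x for woman w (smaller is better, equal = tied).
    Returns {man: woman}, or None if the final test of the algorithm fails."""
    live = {w: set(r) for w, r in rank_w.items()}   # men x with (x, w) in E' or Ec
    todo = {x: [list(g) for g in gs] for x, gs in pref_m.items()}  # E', per man
    ec = {x: set() for x in pref_m}                 # Ec, indexed by man
    m_of, w_of = {}, {}                             # M as woman->man and man->woman
    touched = set()                                 # women ever non-isolated in Gc

    def delete(x, w):
        """Delete edge (x, w) from E' and Ec, and from M if it is matched."""
        live[w].discard(x)
        ec[x].discard(w)
        if w_of.get(x) == w:
            del w_of[x], m_of[w]

    def top_choices(x):
        """Pop the best remaining tie group of x that still has live edges."""
        while todo[x]:
            group = [w for w in todo[x].pop(0) if x in live[w]]
            if group:
                return group
        return []

    while True:
        moved = False
        queue = [x for x in pref_m if not ec[x]]    # men isolated in Gc
        while queue:                                # proposal step
            x = queue.pop()
            if ec[x]:
                continue
            for w in top_choices(x):
                ec[x].add(w)
                touched.add(w)
                moved = True
                for y in [y for y in live[w] if rank_w[w][y] > rank_w[w][x]]:
                    delete(y, w)                    # w drops strict successors of x
                    if not ec[y]:
                        queue.append(y)
        if not moved:
            break                                   # all free men ran out of proposals
        for x in pref_m:                            # extend M to a maximum matching
            if x in w_of or not ec[x]:
                continue
            parent, frontier, end = {}, [x], None   # parent: woman -> man reaching her
            while frontier and end is None:
                y = frontier.pop()
                for w in ec[y]:
                    if w in parent:
                        continue
                    parent[w] = y
                    if w not in m_of:
                        end = w                     # free woman: augmenting path found
                        break
                    frontier.append(m_of[w])
            if end is not None:
                w = end
                while w is not None:                # flip the path back to x
                    y = parent[w]
                    w_old = w_of.get(y)
                    w_of[y], m_of[w] = w, y
                    w = w_old
            else:
                for w in parent:                    # parent's keys are exactly N(Z)
                    worst = max(rank_w[w][y] for y in live[w])
                    for y in [y for y in live[w] if rank_w[w][y] == worst]:
                        delete(y, w)
    return dict(w_of) if all(w in m_of for w in touched) else None
```

The final line implements the termination test: the matching is reported only if every woman that was ever non-isolated in $G_c$ is matched.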



2.2 The New Algorithm 

We now show how to modify the algorithm so that it runs in time $O(nm)$. Our
method maintains level-maximal matchings and uses level-maximal augmenting
paths.

The running time of the algorithm for a strongly stable matching is actually 
the time spent on looking for augmenting paths. The notion of the level of an edge 
and the level of a vertex help us to search for augmenting paths in a streamlined 
manner. The vertices with higher levels are given precedence when searching for 
augmenting paths. When we search for augmenting paths with this precedence 
and we succeed in finding one, then we can show that the level numbers of all 
the edges traversed are at least the level number of the unmatched vertex at the 

^2 It is here that we slightly deviate from Irving’s algorithm. We delete edges
whenever we identify a free man who cannot be matched. Irving first computes a
maximum matching in $E_c$ and then deletes edges.

^3 For complete instances, it is particularly easy to decide whether the final matching
M is stable. M is stable if it is a perfect matching in G.
Set phase number i = 1, E' = E and E_c = ∅.
M = ∅
repeat
    while ∃ a free man x do
        move all top choice edges e = (x, w) of x in E' to E_c and delete all
        edges (x', w) from E' ∪ E_c which w ranks strictly after e.
    end while
    Let E_i be the edges moved to E_c.
    for all free men x w.r.t. M do
        if an alternating path from x to a free woman exists then
            let w be a free woman [of maximal level] reachable from x by an
            alternating path and let p be an alternating path from x to w
            M = M ⊕ p
        else
            let Z be the set of men reachable from x by alternating paths and
            let N(Z) be the women adjacent to them in E_c;
            for all women w ∈ N(Z) do
                delete all lowest ranked edges in E_c ∪ E' incident to w;
            end for
        end if
    end for
    i = i + 1
until (all free men have run out of proposals)
declare M strongly stable if every woman that was ever non-isolated in G_c
during the execution of the algorithm is matched in M. Otherwise, there is no
strongly stable matching.



Fig. 1. Two algorithms for strongly stable marriage. The algorithms differ by the phrase
[of maximal level]. Without the phrase, the algorithm may augment the current
matching along any augmenting path and the running time is $O(m^2)$. With the phrase, an
augmenting path to a woman of maximal level (see Section 2.2) must be used. The
running time improves to $O(nm)$.



end of the augmenting path. This allows us to bound the total number of edges 
traversed in our search for augmenting paths. 

Definition 1. Let $\mathcal{E}_i$ be the edges added to $G_c$ in phase i and define the level
l(e) of an edge to be the phase when this edge was first added to $G_c$. Edges never
added to $G_c$ have no level assigned to them.

So, the set of edges ever added to $G_c$ consists of the disjoint union
$\mathcal{E}_1 \cup \mathcal{E}_2 \cup \dots \cup \mathcal{E}_r$, where r is the total number of phases in the
algorithm. Note that r can be as large as m.



Definition 2. Define the level l(v) of a vertex v to be the minimum level of the
edges in $G_c$ incident to v. The level of an isolated vertex is undefined.

Definition 3. The level l(M) of a matching M is the sum of the levels of the
matched women. A matching M is level-maximal if $l(M) \ge l(M')$ for any
matching M' which matches the same men.
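
As a small illustration of Definitions 1 to 3, the sketch below derives vertex levels and the level of a matching from the levels of the edges currently in $G_c$. The function names and the dictionary encoding (edges as `(man, woman)` pairs with distinct vertex names) are assumptions of this example.

```python
def vertex_levels(edge_level):
    """edge_level maps each edge (man, woman) currently in Gc to the phase in
    which it was first added; isolated vertices get no entry (level undefined)."""
    level = {}
    for (x, w), l in edge_level.items():
        for v in (x, w):
            level[v] = min(l, level.get(v, l))
    return level

def matching_level(matching, level):
    """Level of a matching: the sum of the levels of the matched women."""
    return sum(level[w] for _, w in matching)
```

For two matchings that match the same men, the one with the larger `matching_level` is preferred; a level-maximal matching attains the maximum.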



Lemma 1. For a man, all incident edges in $G_c$ have the same level. All women
adjacent to a man of level i have level at most i. When a woman loses an incident
edge in $E_c$, she loses all her incident edges in $E_c$.

Proof. Obvious. 



Lemma 2. A matching M is level-maximal iff there is no alternating path from 
a free woman to a woman of lower level. 

Proof. Observe that the endpoint of the path is a matched woman. Augmenta- 
tion increases the level of the matching and does not change the set of matched 
men. 

For the converse, assume that M is not level-maximal. Let M' be level-maximal
and matching the same men. Then $M \oplus M'$ is a set of alternating
paths and cycles. Augmenting a cycle does not change the level sum. Thus there
must be at least one path whose augmentation to M increases the level sum.
Since the degree of every man in $M \oplus M'$ is either zero or two, the path must
connect two women, one free in M and one free in M'.



Lemma 3. If M is level-maximal, x is a free man with respect to M, w is a
woman of maximal level reachable from x by an augmenting path and p is an
augmenting path from x to w, then $N = M \oplus p$ is level-maximal.

Proof. Let us look at an alternating path p' from a free woman w' to a matched
woman w'' (all with respect to N). We will show that $l(w') \le l(w'')$ and thereby,
by Lemma 2, that N is level-maximal.

If p' does not contain any edge from p, then p' was an alternating path from
a free woman w' to a matched woman w'' in M. Since M is level-maximal, by
Lemma 2, $l(w') \le l(w'')$.

Let us then assume that p' contains some edge(s) from p.




Fig. 2. The thick edges belong to the matching N 
Let x' denote the first vertex on the path p' that belongs to p, which we meet
while traversing p' from w'. Let e' denote the first edge belonging to p (Figure
2). The vertex x' must be a man, because all edges incident to vertices on p
and not belonging to p cannot belong to the matching N and we started the
traversal of p' from the unmatched woman. So, e' is matched. Let us now look at
the part of p that has x' at one end and does not contain e'. It has the man
x, who was free in M, at its other end. Since $M \oplus p = N$, the matched edges of
path p were exactly reversed before the augmentation: the edges of p that are
now present in the matching N were not present in the matching M, and the
other way round. This means that w' was reachable by an
alternating path from x in M. Thus $l(w') \le l(w)$.

Analogously, let $w_2$ denote the first vertex on the path p' that belongs to p
which we meet when we traverse p' beginning from the matched woman w''. Let
e'' denote the first edge belonging to p (Figure 2). It is not difficult to see
that $w_2$ must be a woman. Now, if we look at the part of p that has $w_2$ at
one end and does not contain e'', we see that it has the woman w at its
other end. Thus in M there existed an alternating path from the free woman w
to the matched woman w'' and hence, by Lemma 2, $l(w) \le l(w'')$.

Combining the observations, we get that $l(w') \le l(w'')$.

Lemma 4. M is a level-maximal matching at all times of the execution.

Proof. We use induction on time. Initially, M is empty and therefore
level-maximal. For the induction step assume that M is level-maximal at the
beginning of phase i.

First, every free man proposes to the women at the top of his list. This
introduces the edge set $\mathcal{E}_i$. The level of non-isolated women does not change;
the level of women previously isolated and not isolated anymore is set to i. M is
still level-maximal. Assume otherwise; then there must be an alternating path
from a free woman to a woman of lower level. This path must use one of the new
edges. The new edges are incident to free men, a contradiction.

Every woman keeps only her best proposals. For a particular woman w this
has one of two effects: either she does not drop any incident edge or she keeps
only edges in $\mathcal{E}_i$ (not necessarily all of them). The matching M may be reduced
in size. Let us use M' to denote the resulting matching. We claim that it is
level-maximal. Assume otherwise; then there must be an alternating path p from a
free woman to a woman of lower level. It cannot use any of the new edges since
new edges are incident to free men. Thus p can use only old edges. Also p cannot
start at a woman of level i since only new edges are incident to such a woman.
Thus p starts at a woman of level less than i and hence the woman is free with
respect to M. Since $M' \subseteq M$, p is alternating with respect to M, a contradiction
to the level-maximality of M.

Next, we consider the free men in turn and search for augmenting paths. Let
x be a free man.

If no augmenting path starting at x exists, let Z be the set of men reachable
by alternating paths from x and let N(Z) be their neighbours. Then
$|Z| > |N(Z)|$. We delete all lowest rank edges incident to the women in N(Z). This
may decrease the size of the matching. The matching clearly stays level-maximal.

If an augmenting path exists, let p be an augmenting path to a woman of
maximal level. We use p to increase the cardinality of the matching. By Lemma 3,
the resulting matching is level-maximal.

2.3 The Search for Augmenting Paths and the Analysis 

We come to the implementation of the search for augmenting paths and the 
analysis. 

Let x be a free man. We need to determine a maximal level free woman w
reachable from x and an augmenting path from x to w. Let p be such an
augmenting path. Then all women on this path have level at least l(w) by Lemma 2.
Note that $l(w) \le l(x)$. This is because all the women adjacent to x have level
at most l(x), so if w is adjacent to x, then $l(w) \le l(x)$. If w is not adjacent to
x and if $l(w) > l(x)$, then p contains an alternating path from a free woman of
higher level (that is, w) to a matched woman of lower level (the neighbour of x).
This contradicts the level-maximality of the matching.

We organise the search in rounds l(x), l(x) − 1, l(x) − 2, .... In round j,
we explore all augmenting paths starting in x and exploring only edges out of
vertices of level j or larger. We stop in round j when a free woman of level j is
reached by the search or if the Hungarian tree rooted at x has reached its full
size. In the former case, j is the maximal level of a woman reachable from x by
an augmenting path. In the latter case, no free woman is reachable from x. If
the search has not stopped yet, the frontier of the search consists of women of
level less than j. In the next round, we continue the search from all women of
level j − 1 in the frontier.

In order to find these women, we maintain an array of buckets (= linear
lists) which implements a simple priority queue. All buckets are initially empty.
At the beginning of round j, bucket $B_l$, $l \le j$, contains the women of level l in the
frontier. We also keep an (unsorted) list of the non-empty buckets and the total
number of women contained in the buckets. We initialise the bucket structure
by putting the neighbours of x into the appropriate buckets and setting j to
l(x). In round j, we continue the search from the women in bucket j. If the
bucket is empty and the number of unexplored women is positive, we decrease
j by one. If the bucket is empty and the number of unexplored women is zero,
we stop: there is no augmenting path starting at x (failure). If the bucket is
non-empty, let w be a woman in the bucket. We remove w from the bucket. If w
is free, we stop (success): w is the highest ranked woman reachable from x. If w
is matched, we explore alternating paths from w (starting with matched edges)
until a woman of level less than j is reached. These women are then added to
their appropriate buckets. When the search stops, we empty all buckets using
the list of non-empty buckets.
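
The round-based search can be sketched as follows. This is a simplified sketch under stated assumptions, not the authors' implementation: buckets are kept in a dictionary instead of an array with a list of non-empty buckets, women of level at least j are placed into the current bucket j so that they are explored within round j, and the names `find_max_level_free_woman`, `ec`, `level` and `man_of` are hypothetical.

```python
def find_max_level_free_woman(x, ec, level, man_of):
    """Search from the free man x in rounds j = l(x), l(x) - 1, ...

    ec[y]: women adjacent to man y in Gc; level[v]: level of vertex v;
    man_of[w]: partner of w if she is matched.
    Returns (w, parent): w is a free woman of maximal level reachable from x
    (None on failure); parent maps each reached woman to the man preceding her.
    """
    buckets = {}          # round index -> frontier women waiting in that bucket
    parent = {}
    pending = 0           # total number of women currently in the buckets
    j = level[x]
    stack = [x]           # men whose neighbours still have to be scanned
    while True:
        while stack:
            y = stack.pop()
            for w in ec[y]:
                if w not in parent:
                    parent[w] = y
                    # women of level >= j are explored in the current round
                    buckets.setdefault(min(level[w], j), []).append(w)
                    pending += 1
        if pending == 0:
            return None, parent      # failure: the keys of parent form N(Z)
        if not buckets.get(j):
            j -= 1                   # bucket j is empty: start the next round
            continue
        w = buckets[j].pop()
        pending -= 1
        if w not in man_of:
            return w, parent         # success: w is free
        stack.append(man_of[w])      # follow the matched edge out of w
```

On success, the augmenting path is recovered by following `parent` from w back to x, alternating with the matched edges. The number of times j is decreased corresponds to the term l(x) − j(x) + 1 in the analysis.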

Let j(x) be the minimal bucket index from which we remove a woman. In
the case of failure this is the minimal level of a woman reachable from x and in
the case of success this is the maximum level of a woman reachable from x by
an augmenting path.

The time for the search from x is proportional to the number k of edges
explored in the search plus l(x) − j(x) + 1. We charge this cost as follows:
In the case of failure we charge one unit to each edge deleted (this
accounts for k) and we charge l(x) − j(x) + 1 to the minimum level woman w
reachable from x. The first kind of charge adds up to m since every edge is
deleted at most once. The second kind of charge is less than the difference between
the current level of w and the next level of w. Thus for a single woman the total
charge of the second kind is bounded by m. We conclude that the total cost of
unsuccessful searches is $O(nm)$.

In the case of success, we charge both costs to w. Observe that all edges
explored have level at least l(w) (= j(x)) and at most i (= the phase number)
and that the level of w jumps to at least i + 1 if it ever becomes free again. Thus
every edge can be assigned at most once to w. Also l(x) − j(x) + 1 is bounded
by the difference between the current level of w and the next level of w. Thus
the total charge to w is bounded by m. The total cost of successful searches is
therefore bounded by $O(nm)$.

Theorem 1. Strongly stable matchings for the classical stable marriage problem
can be computed in time $O(nm)$.

Note that the running time of our strongly stable matching algorithm is actually
$O(|W| m)$ since the total cost of all unsuccessful searches and augmentations
is shared by women and the cost charged to a single woman sums to at most
m over all phases. So, if $|W| \ll |X|$ or $|X| \ll |W|$ (then we reverse the roles of
men and women: it is free women who propose in every phase and it is men
who pay for the augmentations), then we can bound the running time of our
algorithm by $O(\min(|X|, |W|) \cdot m)$.



3 Extension to Hospitals-Residents 

Recall that the hospitals-residents problem is a many-to-one extension of the
classical stable marriage problem. We give an $O(m(|R| + \sum_{h \in H} p_h))$ algorithm
for computing a strongly stable matching for the hospitals-residents problem.
Our algorithm is based on the algorithm in [4], which is an $O(m^2)$ algorithm.
We obtain the improved running time by again restricting all augmentations to
result in level-maximal matchings. We give an outline of our approach here;
the full version of the paper has all the proofs and details.

3.1 The Algorithm in [4] 

We first review a variant of the algorithm in [4] and then present our modified 
algorithm. The algorithm in [4] generalises the ideas used for computing strongly 
stable matchings in [6] to the hospitals-residents problem. 

As in the case of the stable marriage problem, the algorithm proceeds in
phases. In any phase, every free resident proposes to all hospitals currently at
the top of his list and residents become provisionally assigned to hospitals. Each
hospital h can accommodate up to $p_h$ residents, and it needs to keep only the
best $p_h$ proposals made to it, but if there is a tie in the last place of its list (called
the tail), then h can be provisionally assigned to more than $p_h$ residents. We
introduce a few terms:







— A hospital is said to be over-subscribed, under-subscribed or fully subscribed
according as it is provisionally assigned a number of residents greater than,
less than, or equal to, its quota.

— A resident r who is provisionally assigned to a hospital h is said to be bound
to h if h is not over-subscribed or r is not in h’s tail (or both).

— A resident r is dominated in a hospital h’s list if h prefers to r at least $p_h$
residents who are provisionally assigned to it.
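
The three terms above can be expressed as small predicates. The sketch below is illustrative only: the function names, the `assigned`, `quota`, `rank_h` and `current_list` dictionaries, and the reading of "tail" as the set of residents tied in last place on the hospital's current list are assumptions of this example.

```python
def subscription_status(h, assigned, quota):
    """Compare the number of provisional assignees of h with its quota."""
    n = len(assigned[h])
    if n > quota[h]:
        return "over-subscribed"
    if n < quota[h]:
        return "under-subscribed"
    return "fully subscribed"

def tail(h, rank_h, current_list):
    """Residents tied in last place on h's current list (larger rank = worse)."""
    if not current_list[h]:
        return set()
    worst = max(rank_h[h][r] for r in current_list[h])
    return {r for r in current_list[h] if rank_h[h][r] == worst}

def is_bound(r, h, assigned, quota, rank_h, current_list):
    """r is bound to h if r is assigned to h and h is not over-subscribed
    or r is not in h's tail (or both)."""
    over = len(assigned[h]) > quota[h]
    return r in assigned[h] and (not over or r not in tail(h, rank_h, current_list))

def is_dominated(r, h, assigned, quota, rank_h):
    """h strictly prefers at least quota[h] of its provisional assignees to r."""
    better = sum(1 for s in assigned[h] if rank_h[h][s] < rank_h[h][r])
    return better >= quota[h]
```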

The algorithm maintains two graphs $G'$ and $G_c$ which are subgraphs of G. $G_c$
is called the provisional assignment graph with edge set $E_c$ and $G'$ is the graph
of edges $E'$ not considered yet. During the execution of the algorithm, residents
become provisionally assigned to hospitals, which means that edges are moved
from $G'$ to $G_c$. The algorithm proceeds in the same way as the algorithm for
strongly stable marriage, by deleting edges e = (r, h) which cannot belong to
any strongly stable matching.

Reduced Assignment Graph: We maintain a graph $G_r \subseteq G_c$, called the
reduced assignment graph. The residents who appear in $G_r$ are those that are not
bound to any hospital (we call such residents unbound). So, for any hospital h,
the edges incident to h in $G_r$ are to the unbound residents, and hence are at the
tail of h’s list. Each hospital h has a reduced quota in the reduced assignment
graph, which is the difference between the original quota $p_h$ and the number of
residents bound to h. So, the vertices of $G_r$ are the set of unbound residents
and the set of hospitals which are the neighbours of the unbound residents. The
reduced assignment graph of phase i is denoted as $G_r^{(i)}$.

Now the algorithm is very similar to the algorithm for strongly stable marriage,
except that we compute maximum matchings in the reduced assignment
graph. Initially, $G' = G$; $E_c = \emptyset$; all the residents are free and $G_r^{(0)}$ is the empty
graph. At the beginning of phase i, we have a maximum matching M in $G_r^{(i-1)}$. If
a resident is free with respect to M, then he is isolated in $G_r^{(i-1)}$. Then we move
the edges corresponding to the topmost choices of every free resident from $E'$ to
$E_c$. This denotes free residents being provisionally assigned to hospitals. Whenever
a hospital h becomes fully or over-subscribed, we delete all edges (r, h),
where r is dominated on h’s list, from $G'$ and $G_c$. The reduced assignment graph
$G_r^{(i)}$ is computed from $G_r^{(i-1)}$. Observe that an edge (r, h) can change state from
bound (r is bound to h) to unbound (r is not bound to h) but not vice versa. If
a new edge that gets added to $G_c$, corresponding to one of the top choices of a
free resident in $G_r^{(i-1)}$, is a bound edge, then it could cause some bound edges to
become unbound or it could cause some edges to get deleted. Any edge of $G_r^{(i-1)}$
that is not deleted from $G_c$ continues to remain in $G_r^{(i)}$. The change of state of
an edge (r, h) from bound to unbound need not make the resident r unbound
unless (r, h) was the only bound edge incident to r and now (r, h) has changed
state to unbound. Then r, which was not present in $G_r^{(i-1)}$, starts appearing in
$G_r^{(i)}$. Then we extend M in $G_r^{(i)}$ to match all the unmatched residents.

Augmenting path: In the hospitals-residents setting, a hospital h is considered
free in $G_r$ if it is not matched up to its reduced quota. An alternating
path from a free resident to a hospital that is not filled up to its quota is
considered an augmenting path.

We iterate over the free residents in arbitrary order. Let r be any free resident.
If there is an augmenting path starting at r, we use it to increase the cardinality
of the matching. Otherwise, let Z be the set of residents reachable from r by
alternating paths and let N(Z) be the set of hospitals adjacent to Z in $E_c$. For
each hospital $h \in N(Z)$ we delete all lowest ranked edges in $E_c \cup E'$ incident to
it.

At the end of the phase, we have a maximum matching M in $G_r^{(i)}$. Also, every
free resident is isolated in $G_c$ since the edges incident to it were removed when
we searched for an augmenting path starting from it. When all free residents
have run out of proposals, we need to find a feasible matching M' in $G_c$ which
contains the maximum matching M in $G_r$ and matches every bound resident r to
a hospital that r is bound to. M' is a strongly stable matching if a hospital that
was fully or over-subscribed at some point in the execution of the algorithm is
fully matched in M' or a hospital that was always under-subscribed has assignees
in M' equal to its degree in $G_c$. We refer the reader to [4] for the proof of
correctness of this algorithm.

3.2 Our Modifications 

Let us extend our definitions in order to capture the somewhat different structure
of the hospitals-residents problem.

Definition 4. Define the level of an edge e, l(e), to be the phase in which e is added
to the reduced assignment graph $G_r$.^4

Definition 5. Define the level of a vertex v, l(v), to be the minimum level of
the edges incident to v. If v does not belong to $G_r$, its level is undefined.

Definition 6. Define the level of a matching M, l(M), to be the sum over all
hospitals of the level of a hospital multiplied by the number of edges that this
hospital is matched with.

Definition 7. A matching M is level-maximal if $l(M) \ge l(M')$ for any
matching M' which matches the same residents.

The following lemmas show how to maintain a level maximal matching. The 
proofs are available in the full version of this paper. 

Lemma 5. A matching M is level-maximal iff there is no alternating path start- 
ing with an unmatched edge from a free hospital to a hospital of lower level. 

Lemma 6. If M is level-maximal, r is a free resident with respect to M, h is
a hospital of maximal level reachable from r by an augmenting path and p is an
augmenting path from r to h, then $N = M \oplus p$ is level-maximal.

Lemma 7. M is a level-maximal matching at all times of the execution. 

^4 Note that an edge appears in $G_r$ at some phase which might not necessarily be the
phase in which this edge appeared in $G_c$.
3.3 The Running Time

The search for augmenting paths in $G_r$ is implemented as in the classical stable
marriage problem. Using similar arguments, one can see that the cost of
unsuccessful searches is $O(m)$ and of successful searches is $O((\sum_{h \in H} p_h)\, m)$.
Furthermore, with an appropriate representation of the graphs, all changes of
$G_r$ can be done in time $O(|R| m)$.

Theorem 2. Strongly stable matchings for the hospitals-residents problem can
be computed in time $O(m(|R| + \sum_{h \in H} p_h))$.

We conclude that in the worst case $\sum_{h \in H} p_h$ can be as large as m, in which
case we get a running time of $O(m^2)$, but in any practical application we expect
that $\sum_{h \in H} p_h = O(|R|)$, in which case we get a total running time of $O(|R| m)$.
Abstract. We study the problem of embedding arbitrary finite metrics
into a line metric in a non-contracting fashion to approximate the minimum
average distortion. Since a path metric (or a line metric) is quite
restricted, these embeddings could have high average distortions ($\Omega(n)$,
where n is the number of points in the original metric). Furthermore, we
prove that finding the best embedding of even a tree metric into a line to
minimize average distortion is NP-hard. Hence, we focus on approximating
the best possible embedding for a given input metric.

We give a constant-factor approximation for the problem of embedding
general metrics into the line metric. For the case of the metrics which
can be represented as trees, we provide improved approximation ratios
in polynomial time as well as a QPTAS (Quasi-Polynomial Time
Approximation Scheme).



1 Introduction 

Metric embeddings have recently attracted much attention in theoretical com- 
puter science because of their many algorithmic applications. These range from 
simplifying the structure of the input data for approximation and online prob- 
lems [5,8,9,15,18,24], serving as a well-roundable relaxation of important NP- 
hard problems [7,11,12,13,17,27] or simply by being the object of study [1,16] 
arising from applications such as computational biology. Embedding techniques 
have thus become an indispensable addition to the algorithms toolbox, provid- 
ing powerful and elegant solutions to many algorithmic problems (see, e.g., [29, 
Chapter 15] and [22]). 

An embedding of a metric $(V, d)$ into a “simpler” host metric $(H, \delta)$ is a
map $f : V \to H$; the embedding is a good one if the distances between
points in d closely resemble those between their images in $\delta$. An embedding
is called non-contracting if the map does not decrease any distances, i.e.,
$d(x, y) \le \delta(f(x), f(y))$^1 for all $x, y \in V$. We restrict ourselves to non-contracting

* Supported by NSF ITR grants CCR-0085982 and CCR-0122581.

** Supported in part by NSF grant CCR-0105548 and ITR grant CCR-0122581.

^1 In the sequel, we will abbreviate $\delta(f(x), f(y))$ to $\delta(x, y)$.
embeddings in this paper. Perhaps the most popular and useful measure of the
quality of an embedding f is the distortion $\alpha = \alpha(f)$, which is:

$$\text{distortion } \alpha = \max_{x,y \in V} \frac{\delta(x,y)}{d(x,y)}.$$

A closely related measure is that of average distortion, which is

$$\text{average distortion } \rho(f) = \frac{\sum_{x,y \in V} \delta(x,y)}{\sum_{x,y \in V} d(x,y)}.$$
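
As a small numerical illustration, the sketch below computes both measures for an embedding into the line, assuming the usual definitions: the distortion is the maximum ratio of embedded distance to original distance over all pairs, and the average distortion is the ratio of the sum of embedded distances to the sum of original distances. The function name `distortions` and the dictionary encoding of the metric are assumptions of this example.

```python
from itertools import combinations

def distortions(points, d, position):
    """Return (distortion, average distortion) of a line embedding.

    d[(x, y)]: original distance for each pair as produced by
    combinations(points, 2); position[x]: coordinate of x on the line.
    """
    pairs = list(combinations(points, 2))
    delta = {(x, y): abs(position[x] - position[y]) for x, y in pairs}
    if any(delta[p] < d[p] for p in pairs):
        raise ValueError("embedding is not non-contracting")
    alpha = max(delta[p] / d[p] for p in pairs)
    rho = sum(delta.values()) / sum(d[p] for p in pairs)
    return alpha, rho
```

For the 4-cycle with unit edges embedded at positions 0, 1, 2, 3, the pair of endpoints is stretched from distance 1 to 3, while the sums of distances are 10 and 8, so the distortion is 3 and the average distortion is 1.25.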

While many embedding techniques and algorithms are known, the analyses
for these embeddings usually only offer uniform bounds on the distortion of the
embeddings; few results approximate the distortion of the embeddings to
better than these uniform bounds. This is best shown by a concrete example:
Matoušek [28] proved that any metric $(V, d)$ can be embedded into the real line
with distortion $O(|V|)$; furthermore, the result is existentially tight, as the
n-cycle cannot be embedded into the line with distortion $o(|V|)$ (see, e.g., [31,21]).
However, no algorithm is known which offers per-instance guarantees; hence,
while it may be possible to embed $(V, d)$ into $\mathbb{R}$ with distortion $\alpha = O(1)$,
no algorithms are known which give us embeddings with distortion, say, that is
within $O(|V|^{1-\epsilon})$ times $\alpha$. No results are known even when we replace distortion $\alpha$
by average distortion $\rho$ as the measure of goodness^2.

1.1 Our Results 

In this paper, we prove results for approximating the average distortion when
embedding metrics into the line $\mathbb{R}$ (while ensuring that the map is
non-contracting). We can think of embeddings into a line as defining a tour on the nodes of
the original metric. Note that for an embedding to be non-contracting, it is
necessary and sufficient to have the distance between adjacent pairs of vertices
in the tour be the same as their distance in the input metric. Our results
demonstrate a close relationship between minimizing average distortion and the
problems of finding short TSP tours [25], minimum latency tours [10,20,4], and
optimal k-repairmen solutions [14]. In particular, we prove the following results.

— Hardness for average distortion: We prove that the problem of finding 
a minimum average distortion non-contracting embedding of finite metrics 
into the line is NP-hard, even when the input metric is a tree metric. This 
is proved via a reduction from the Minimum Latency Problem on trees [33] . 

— Constant-factor approximations: We give an algorithm that embeds any
metric (V, d) into the line with average distortion that is within a constant
factor of the minimum possible over all non-contracting embeddings. In fact, we
prove a slightly more general bound on non-contracting embeddings into
k-spiders (i.e., homeomorphs of stars with k leaves). This result uses a lower
bound on the minimum average distortion of a non-contracting embedding
into a k-spider in terms of the minimum k-repairmen tour [14] on the metric.
We also show a tightened result for the case of 2-spiders using ideas from
constructing minimum latency tours [20].

^1 One notable exception is the remark of Linial et al. [27] that the optimal embedding
of any finite metric into (unbounded dimensional) Euclidean spaces to minimize
distortion can be computed as a solution to a semi-definite program.

— QPTAS on trees: For tree metrics on n nodes, we give an algorithm for
finding a $(1+\epsilon)$-approximation to the minimum average distortion
non-contracting embedding into a line in $n^{O(\log n/\epsilon^2)}$ time. Our algorithm uses
a lower bound on the minimum average distortion related to the TSP tour
length and latencies of appropriately chosen segments of an optimal tour.
In this way, it extends the ideas of Arora & Karakostas [6] for minimizing
latency on trees to the more general time-dependent TSPs [10] to provide a
QPTAS for the latter problem as well.

Given a tree metric as input, if the minimum average distortion is measured 
only over the endpoints of the edges of the tree (we call this objective the average 
tree-edge distortion), we can prove that an embedding following an Euler tour 
of the tree is optimal. This tour can be found in polynomial time by dynamic 
programming. We omit the description of this algorithm due to lack of space. 



1.2 Related Work 

The definition of average distortion is by no means new; e.g., Alon et al. [2] study 
the question of embedding a metric into a tree with low average distortion. In 
recent work on average distortion that is closer to our work, Rabinovich [30] 
proves bounds on average distortion of non-expanding embeddings into a line 
and shows the close connection between this and the max-flow min-cut ratio for 
concurrent multicommodity flow with applications to finding quotient cuts in 
graphs [26]. 

Our problem is similar to that of finding the Minimum Linear Arrangement
(MLA), for which Rao & Richa [32] gave an $O(\log n)$ approximation using the
notion of spreading metrics. However, while the MLA problem involves
minimizing the average stretch of the edges $\sum_{\{u,v\} \in E} |\pi(u) - \pi(v)|$ under all maps
$\pi : V \to [n]$, the mappings in our problem are $f : V \to \mathbb{R}$, and must ensure that
$|f(u) - f(v)| \ge d(u,v)$ for all $\{u,v\} \in V \times V$.

The problem of finding Minimum Latency tours (a.k.a. the traveling repair- 
man problem) is most relevant to our discussion in terms of techniques used. 
This problem requires a repairman who starts from a depot on a given finite 
metric to visit n customers, one at each node of the metric; his goal is to min- 
imize the average waiting time or latency of the customers, where the waiting 
time of a customer is the sum of the distances of all edges traversed by the re- 
pairman before visiting this customer. The version of this problem with only one 
repairman (also called the Minimum Latency Problem) is known to be NP-hard 
even on trees [33] and MAX-SNP hard in general [10]. The first constant-factor
approximation for this problem was given by Blum et al. [10], which was
subsequently improved by Goemans and Kleinberg [20] to the currently best-known
bound of 7.18. Recently, Archer, Levin and Williamson [4,3] gave faster
algorithms obtaining very similar approximation guarantees. For the special cases
of the latency problem on trees, and in $\mathbb{R}^d$ for fixed dimension d, Arora and
Karakostas [6] gave quasi-polynomial time approximation schemes (QPTAS).
The extension of the latency problem to more than one repairman was recently
studied in [14], where the authors show a 16.994-approximation for the general
k-repairman case.
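The latency objective described in this paragraph can be sketched in a few lines of Python. The brute-force search is only meant for tiny instances, and the point names and coordinates below are hypothetical.

```python
from itertools import permutations

def total_latency(order, d):
    """Sum of the arrival times at every stop after the first (the depot)."""
    clock, total = 0.0, 0.0
    for prev, cur in zip(order, order[1:]):
        clock += d(prev, cur)   # time at which `cur` is reached
        total += clock
    return total

def brute_force_min_latency(depot, customers, d):
    """Try every visiting order; exponential, for illustration only."""
    best = min(permutations(customers),
               key=lambda p: total_latency((depot,) + p, d))
    tour = (depot,) + best
    return tour, total_latency(tour, d)

# Hypothetical instance: four points on a line.
pos = {"depot": 0, "a": 1, "b": -2, "c": 3}

def line_dist(x, y):
    return abs(pos[x] - pos[y])

tour, cost = brute_force_min_latency("depot", ["a", "b", "c"], line_dist)
```

Here the best order serves a, then c, then b, with arrival times 1, 3 and 8. Visiting the far customer b last is cheaper than the order that minimizes total travel, which is the tension between TSP length and latency that the paper exploits.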

Finally, a problem that generalizes both the cost of a tour as well as its latency 
into one objective is that of finding time dependent TSP tours. A constant factor 
approximation algorithm is also known for this problem [10]. 

Outline: The rest of the paper is organized as follows. In Section 2, we argue that 
the embedding problem is NP-hard, and give the constant-factor approximation 
algorithm for embedding metrics into the line with constant average distortion. 
Section 3 shows the QPTAS for the case of tree metrics as inputs.

2 Embedding Arbitrary Metrics into the Line 

In this section, we show that we can approximate the average distortion into 
a line for a given metric to within a constant; to this end, we show that the 
problem is closely related to that of finding the minimum latency tours and 
its generalizations in a finite metric space. We omit the proof of the following 
theorem; the reduction is from Minimum Latency on trees. 

Theorem 1. It is NP-hard to find a non-contracting embedding of a given met- 
ric induced by a tree into a line that minimizes the average distortion. 

First, we show a simple 2-approximation for embedding a finite metric into
a special kind of tree metric, namely a k-spider. (A k-spider is a tree with all
vertices except the center having degrees 1 or 2, and hence is a homeomorph of
the star with k leaves.) The case of an n-spider or a complete star is more natural
to argue about, while the 2-spider is a path, giving our main result.

Embeddings into trees. Consider the problem of embedding the given
metric d into a tree metric $\delta$ to minimize average distortion. Let $\Delta = \sum_{u,v \in V} d(u,v)$
denote the sum of all the distances in the metric d, and hence $av(d) = \Delta/n^2$ is
the average distance in d. The median of the metric d is the point $v \in V$ that
minimizes $\Delta_v = \sum_{w \in V} d(v,w)$, and will be denoted by med. Note that we can
decompose $\Delta$ as follows:

    $\Delta = \sum_{u,v \in V} d(u,v) = \sum_{u \in V} \left( \sum_{v \in V} d(u,v) \right) = \sum_{u \in V} \Delta_u \ge n \Delta_{med}$    (1)

since $\Delta_{med} \le \Delta_v$ for all $v \in V$. Consider a shortest-path tree T (which is a star
in a general metric d) rooted at med, and let $d_T$ denote the metric induced by
this shortest path tree. Then the total distance in this tree T is

    $\Delta_T = n^2 \cdot av(d_T) = \sum_{u,v \in V} d_T(u,v) \le \sum_{u,v \in V} \left( d_T(med,u) + d_T(med,v) \right)$
    $\qquad = \sum_{u,v \in V} \left( d(med,u) + d(med,v) \right) = 2n\Delta_{med}$

where the inequality in the second step is just the triangle inequality. This implies
that $n\Delta_{med} \le \Delta \le \Delta_T \le 2n\Delta_{med}$, and thus:

Lemma 1 (See also [34]). Given any graph, the total distance $\Delta_T$ (and hence
the average distance) for the tree rooted at the median is at most $2\Delta$, and is a
2-approximation for the problem of embedding the graph into trees.
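The chain of inequalities behind Lemma 1 can be checked numerically. The following sketch uses hypothetical names and a hypothetical 4-cycle input; it builds the star rooted at the median and returns the four quantities $n\Delta_{med}$, $\Delta$, $\Delta_T$ and $2n\Delta_{med}$, summing over ordered pairs as in (1).

```python
def median_star_bounds(points, d):
    """Return (median, n*Delta_med, Delta, Delta_T, 2*n*Delta_med)."""
    n = len(points)
    # Delta_v: total distance from v to every point.
    row_sum = {v: sum(d(v, w) for w in points) for v in points}
    med = min(points, key=row_sum.get)
    delta = sum(row_sum.values())          # sum over ordered pairs

    def d_star(u, v):
        # Distance in the star centered at med with leaf edges d(med, .).
        return 0 if u == v else d(med, u) + d(med, v)

    delta_star = sum(d_star(u, v) for u in points for v in points)
    return med, n * row_sum[med], delta, delta_star, 2 * n * row_sum[med]

# Hypothetical example: the unit-weight 4-cycle.
cycle = [0, 1, 2, 3]

def cycle_dist(x, y):
    k = abs(x - y)
    return min(k, 4 - k)

med, lower, delta, delta_star, upper = median_star_bounds(cycle, cycle_dist)
```

On the 4-cycle every vertex is a median, and the four quantities come out as 16, 16, 24 and 32, so the star loses a factor 1.5 here, within the bound of 2.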

Note here that the bound of 2 above is an absolute bound on the worst-case 
ratio between the average distance in the output tree and the graph, and is in the 
same flavor as the more traditional results on bounding the maximum distortion 
of embeddings. We next move toward an approximation approach by restricting 
the class of trees into which we embed. 



Embeddings into spiders. We now generalize the previous result to the case
of embeddings into k-spiders. The vertex of degree k is called the center of
the spider, and the components obtained by removing the center are called its
legs [23].

Let $d^*_k$ denote the optimal k-spider embedding. We decompose the sum of
distances in $d^*_k$ as the sum of k-repairman paths rooted at each vertex. Recall
that, in the k-traveling repairman problem, we are given k repairmen starting at a
common depot s. The k repairmen are to visit n customers sitting one per node of
the input metric space. The goal is to find tours on which to send the repairmen
so as to minimize the total time customers have to wait for a repairman to
arrive [14].

Let c be the center of the spider in the optimal k-spider embedding. To
construct k-repairman paths starting from a vertex r, we do the following.
We send one repairman away from the center along the leg of the spider which
contains r. The other k − 1 repairmen travel toward the center c of the spider.
From the center, they go off, one per remaining leg of the spider. The cost of
this k-repairman tour is $\Delta^*_r = \sum_{j \in V} d^*_k(r,j)$. Summing over all choices of the root
r, we see that this is the same as the sum of distances in the embedding $d^*_k$:

    $\sum_{r \in V} \Delta^*_r = \sum_{u,v \in V} d^*_k(u,v)$

Hence, n times the cost of the cheapest k-repairman tour over all choices of
the depots (denoted by $\Delta^{opt}_r$) is a lower bound on the sum of all the distances,
i.e.,

    $\sum_{u,v \in V} d^*_k(u,v) \ge n \cdot \min_r \{\Delta^{opt}_r\}$.

Consider the cheapest k-repairman tour over all choices of centers. Let it
be centered at a vertex c. This tour defines a non-contracting embedding into
a k-spider with c at the center of the spider. Let $d^c(u)$ denote the distance of
vertex u from the center c in the tour. We can bound the sum of distances in
this embedding as follows:

    $\sum_{u,v \in V} d^c_k(u,v) \le \sum_{u,v \in V} \left( d^c(u) + d^c(v) \right) \le 2n \sum_{u \in V} d^c(u) \le 2 \sum_{u,v \in V} d^*_k(u,v)$.



Thus, if we could compute the optimal k-repairman tour centered at c exactly,
we would obtain a 2-approximation to the problem of embedding the metric
into k-spiders. Although the problem of finding an optimal k-repairman tour is
NP-hard, the argument above proves the following.

Theorem 2. Given a $\gamma$-approximation for the minimum k-repairmen problem
on a metric d, we can obtain a $2\gamma$-approximation for embedding the metric d
into a k-spider in a non-contracting fashion to minimize the average distortion.

The current best known approximation factor for the k-repairman problem is
about 17 (due to Fakcharoenphol et al. [14]), leading to the following corollary.

Corollary 1. There is a 34-approximation for minimizing the average
distortion of a non-contracting embedding of a given finite metric into a k-spider.

Embeddings into a line: Improved guarantee. We can get a better
approximation factor for embeddings into the line by employing a slightly different
strategy. Instead of using the result of Fakcharoenphol et al. as a black box, we
give an algorithm to find a 1-repairman tour (i.e., a minimum latency
tour) that is within a factor of 14.36 of the optimum 2-repairmen tour in the
given metric. Since a 1-repairman tour is also a 2-repairmen tour (with the
second repairman doing nothing), we can then apply Theorem 2 to bring down the
overall approximation ratio to 28.72.

The idea behind the algorithm is the same as in scaled search, due to Blum
et al. [10]; here is an outline. To find a 1-repairman solution centered at r:

    for j = 0, 1, 2, 3, ..., do
        $T_j \leftarrow$ tree rooted at r spanning the most vertices among those
        with cost $\le 2^{j+2}$.
    Concatenate Euler tours of the trees $T_j$ (in increasing order of j), to form
    a 1-repairman path.
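The outline above can be sketched in Python for very small inputs. This is a sketch under stated assumptions, not the paper's algorithm as analyzed: the largest tree within each budget is found by brute force over vertex subsets (using a minimum spanning tree of each subset and no Steiner vertices), the budget simply doubles from a hypothetical `first_budget`, and each Euler tour is shortcut to the order in which new vertices are first reached. All function names and the example instance are hypothetical.

```python
from itertools import combinations

def prim_tree(nodes, d):
    """Prim's algorithm on the complete graph over `nodes`.
    Returns (total cost, adjacency lists of the spanning tree)."""
    nodes = list(nodes)
    in_tree = [nodes[0]]
    adj = {v: [] for v in nodes}
    cost = 0.0
    while len(in_tree) < len(nodes):
        u, v = min(((a, b) for a in in_tree for b in nodes if b not in in_tree),
                   key=lambda e: d(e[0], e[1]))
        adj[u].append(v)
        adj[v].append(u)
        in_tree.append(v)
        cost += d(u, v)
    return cost, adj

def largest_tree_within(root, points, d, budget):
    """Brute force: a largest vertex set containing root whose MST costs <= budget."""
    others = [p for p in points if p != root]
    for size in range(len(others), -1, -1):
        for subset in combinations(others, size):
            cost, adj = prim_tree([root, *subset], d)
            if cost <= budget:
                return adj   # size 0 always succeeds with cost 0

def preorder(root, adj):
    """Order of first visits in a depth-first walk of the tree."""
    order, stack, seen = [], [root], set()
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        stack.extend(reversed(adj[v]))
    return order

def scaled_search_path(root, points, d, first_budget=1.0):
    """Visit vertices in the order they are first covered by trees of doubling budget."""
    visited, path, j = {root}, [root], 0
    while len(visited) < len(points):
        adj = largest_tree_within(root, points, d, first_budget * 2 ** j)
        for v in preorder(root, adj):
            if v not in visited:
                visited.add(v)
                path.append(v)
        j += 1
    return path

# Hypothetical instance: four points on a line, rooted at "r".
pos = {"r": 0, "a": 1, "b": -2, "c": 3}

def line_dist(x, y):
    return abs(pos[x] - pos[y])

path = scaled_search_path("r", list(pos), line_dist)
```

The brute-force step stands in for an approximate k-MST routine, so the sketch only illustrates the order in which vertices get picked up as the budget grows.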

Lemma 2. The cost of the 1-repairman tour produced by the preceding algorithm
is within a factor 32 of the cheapest 2-repairman tour.

Proof. Let vertex v be the $i^{th}$ closest vertex to root r in the optimal 2-repairman
tour. Let the distance of v from the root r in the tour be in $[2^j, 2^{j+1})$ in the
optimal solution. Consider the tree $T_j$ of cost $2^{j+2}$ constructed by our algorithm.
We claim that $T_j$ spans at least i vertices. Thus the $i^{th}$ vertex in our tour
has latency at most

    $\sum_{l=0}^{j} (\text{cost of tour of } T_l) \le \sum_{l=0}^{j} 2 \cdot 2^{l+2} \le 2^{j+4}$

Hence, the distance of the $i^{th}$ vertex in our 1-repairman tour is at most 16 times
its counterpart in the optimal 2-repairmen tour.

Although the problem of finding the largest tree with cost at most $2^{j+2}$ is
NP-hard, we can find a tree having as many vertices as this optimal tree
instead (but with cost at most $2 \cdot 2^{j+2}$) using Garg's [19] algorithm for k-MST.
This increases the overall approximation factor to 16 · 2 = 32.

Lemma 3. We can find a 1-repairman tour with cost at most 14.36 times the cost of
the cheapest 2-repairman tour.

Proof. (Sketch) Let b be a real number greater than 1 to be chosen later. Let
$c = b^U$, where U is a real number chosen uniformly at random from the interval
[0, 1]. Instead of finding the trees of cost 2, 4, 8, ... which cover the most vertices,
we will find the trees of cost at most $2c, 2cb, 2cb^2, \ldots$ which cover the most
vertices. Using the methods of Goemans and Kleinberg, we can show that the
approximation ratio of the previous proof can be improved to 14.36.
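The randomized budget schedule in this proof sketch is easy to write down. The snippet below takes $c = b^U$ with U uniform on [0, 1); the value b = 2.0 is only a placeholder, since the proof leaves b to be chosen later, and the function name is hypothetical.

```python
import random

def randomized_budgets(b, rounds, rng):
    """Budgets 2c, 2cb, 2cb^2, ... with c = b**U and U uniform in [0, 1)."""
    c = b ** rng.random()
    return [2 * c * b ** j for j in range(rounds)]

# Placeholder base b = 2.0 and a fixed seed for reproducibility.
budgets = randomized_budgets(2.0, 5, random.Random(0))
```

Consecutive budgets still grow by a factor of b; only the starting offset is random, and it lies between 2 and 2b.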

Note that this improves on the result of Fakcharoenphol et al. [14] for the 
special case of the 2-repairman problem. An application of Theorem 2 now gives 
us the following: 

Theorem 3. There exists a 28.72-approximation algorithm to embed a given
(weighted) metric into a line in a non-contracting fashion to minimize the
average distortion.

As a consequence of the analysis in Lemma 2, we also get the following result: 



Lemma 4. For $l < k$, we can find an l-repairman tour with cost at most $17(k/l)$
times that of the optimal k-repairman tour.

We note that the factor $k/l$ in the above Lemma is necessary, as demonstrated
by the metric induced by an unweighted star graph. Compare the above result
to that of Fakcharoenphol et al. [14], which outputs a k-repairmen tour of cost
$O(1)$ times the minimum l-repairmen tour for $k > l$ (where the factor $k/l$ is not
necessary since the algorithm delivers a solution with more repairmen than the
optimal compared against).

3 Approximation Schemes for Trees 

In this section, we restrict our attention to the special case of tree metrics. We 
give a quasi-polynomial time approximation scheme for minimizing the average 
distortion for embeddings into the line metric. Our algorithm is based on the 
QPTAS given by Arora and Karakostas for the minimum latency problem [6]. 
They proved that a near-optimal latency tour can be constructed by
concatenating $O(\log |V|/\epsilon)$ optimal TSP subtours, and the best such solution can be
found by dynamic programming.

For an embedding $f : V \to \mathbb{R}$ into the line, let the span of the embedding be
defined as $\max_{x,y \in V} |f(x) - f(y)|$, the maximum distance between two points on
the line. We note that an embedding with the shortest span is just the optimal
TSP tour. While embedding a given metric into the line metric, minimizing the
span of the embedding could result in very high average distortion. However,
we show that it suffices to minimize the span locally to find a near-optimal
embedding. In particular, our solution within $(1+\epsilon)$ of optimal minimum average
distortion is to find an embedding that is the union of $O(\log |V|/\epsilon^2)$ TSP tours
with geometrically decreasing number of vertices.

In the sequel, we use n to denote $|V|$, the number of vertices. For our
algorithm, we assume that all the edge lengths are in the range $[1, n^2/\epsilon]$. Indeed, if
D is the diameter of the metric space and u and v are two vertices such that
$d(u,v) = D$, then $\sum_{x,y \in V} d(x,y) \ge nD$. We can then
merge all pairs of nodes with inter-node distance at most $\epsilon D/n^2$, which affects
the sum of distances by at most $\epsilon n D$. Hence the ratio of maximum to minimum
nonzero distance in the metric can be assumed to be $n^2/\epsilon$.

Relation to TDTSPs. We first show that the Arora-Karakostas QPTAS works
also for the case of the Time Dependent Traveling Salesman Problem (TDTSP)
defined by Blum et al. [10]. In the TDTSP, the objective is to minimize a positive
linear combination of the TSP tour value and the total latency of the tour. The 
intuition behind this is that adding a component of TSP in the objective value 
preserves the property that the tour composed of TSP tours continues to remain 
near-optimal. 

We now describe how to break up an optimal tour into locally optimal
segments. Let T denote the optimal tour for the objective function $\alpha\,TSP + \beta\,LAT$,
where TSP and LAT denote the span and latency objective values of the tour
respectively. We break this tour into k segments (k depends on the input
parameter $\epsilon$). In segment i we visit $n_i$ nodes, where

    $n_i = \lceil (1+\epsilon)^{k-1-i} \rceil$ for $i = 1, \ldots, k-1$;  $n_k = \lceil 1/\epsilon \rceil$

Note that these $n_i$'s are chosen in such a way that $n_i \le \epsilon \sum_{j>i} n_j$. Denote
$\sum_{j>i} n_j$ by $r_i$. Replace the optimal tour in each segment, except the last one, by
the minimum-distance traveling salesman tour for that segment. The new tour
now consists of the concatenation of $O(\log n/\epsilon)$ locally optimal TSP tours. This
gives us the following lemma.

Lemma 5. There is a tour that is a concatenation of $O(\log n/\epsilon)$ TSP tours that
has $\alpha\,TSP + \beta\,LAT$ objective value at most $(1+\epsilon)$ times the minimum.
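The geometric segment sizes used in this construction can be tabulated directly. The sketch below assumes the sizes $n_i = \lceil (1+\epsilon)^{k-1-i} \rceil$ for $i < k$ and $n_k = \lceil 1/\epsilon \rceil$, following the Arora-Karakostas construction; the function names are hypothetical. Because of the rounding, the sketch checks the relation between $n_i$ and $r_i = \sum_{j>i} n_j$ only up to an additive 1.

```python
import math

def segment_sizes(eps, k):
    """n_1, ..., n_k: geometrically decreasing sizes, then a final block of ceil(1/eps)."""
    sizes = [math.ceil((1 + eps) ** (k - 1 - i)) for i in range(1, k)]
    sizes.append(math.ceil(1 / eps))
    return sizes

def nodes_to_the_right(sizes):
    """r_i = sum of n_j over j > i, for every segment i."""
    total, out = 0, []
    for n in reversed(sizes):
        out.append(total)
        total += n
    return out[::-1]

# Hypothetical parameters.
eps, k = 0.5, 4
sizes = segment_sizes(eps, k)
right = nodes_to_the_right(sizes)
```

With these parameters the sizes are 3, 2, 1, 2 and the counts to the right are 5, 3, 2, 0, so each segment except the last is small compared with what follows it.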

We now use Lemma 5 to show the following theorem for average distance.

Theorem 4. Any finite metric has a non-contracting embedding into a line that
is composed of $O(\log n/\epsilon^2)$ minimum TSP tour segments with average distortion
no more than $(1+\epsilon)$ times the minimum possible over all such embeddings.

Proof. Our strategy is the same as in Lemma 5. Consider the optimal embedding of
the input tree into a line. We break this embedding up into $O(\log n/\epsilon)$ segments.
Let $n_i$ be the size of the $i^{th}$ segment, defined as before. We now divide the objective
function value according to the segments, so that only the share $C_i$ of segment
i changes if we replace the embedding of segment i with a different embedding.

Let $T_i$ be the length of the embedding of segment i. If $i_0$ is the left-most
node in the embedding of the segment i, then let $L_i = \sum_{j \in \text{segment } i} \ell(i_0, j)$ be the sum
of the distances of all nodes in segment i from node $i_0$. Note that $L_i$ is the total
latency of vertices in segment i with $i_0$ as root. And let $D_i = \sum_{u,v \in \text{segment } i} \ell(u,v)$ be
the sum of all the pairwise distances in segment i.

Let $q_i = \sum_{j<i} n_j$ and $r_i = \sum_{j>i} n_j$ be the number of total nodes to the left
and right of segment i respectively.

The contribution of the segment i to the objective comes from the following 
distinct terms. 

1. If a vertex u is to the left of the segment i and a vertex v is to the right, 
then the segment i adds Ti to the distance between them. 

2. If a vertex u is to the left and w is in the segment i, then the contribution
is $\ell(i_0, w)$, the distance from the left-most vertex $i_0$ of the segment i to w.

3. If a vertex v is to the right and w is in the segment i, then the contribution
is $T_i - \ell(i_0, w)$.

4. If both the vertices w and w' are in the segment i, then the contribution is
$\ell(w, w')$.

These contributions, when summed up over all pairs of vertices, give:

    $C_i = q_i r_i T_i + q_i L_i + r_i (n_i T_i - L_i) + D_i$    (2)

Note that $D_i \le n_i^2 T_i$. For $i = 2, \ldots, k$, we know that $n_i \le q_i$ and $n_i \le \epsilon \cdot r_i$.
Hence, comparing $D_i$ with the first term in (2), we get

    $(1+\epsilon)\left(q_i r_i T_i + q_i L_i + r_i (n_i T_i - L_i)\right) \ge C_i \ge q_i r_i T_i + q_i L_i + r_i (n_i T_i - L_i)$    (3)
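The decomposition in (2) can be sanity-checked on a small line embedding. The sketch below makes one assumption that the text does not spell out: $T_i$ is measured from the left-most node of segment i to the left-most node of the next segment, so that the shares of all segments add up to the total pairwise distance. Pairs are unordered, and all names and the example coordinates are hypothetical.

```python
from itertools import combinations

def segment_shares(positions, sizes):
    """positions: sorted coordinates on the line; sizes: n_1..n_k summing to len(positions).
    Returns the share C_i of every segment according to (2)."""
    shares, start, n = [], 0, len(positions)
    for n_i in sizes:
        seg = positions[start:start + n_i]
        q_i, r_i = start, n - start - n_i
        # Assumption: T_i reaches up to the first node of the next segment.
        end = positions[start + n_i] if r_i > 0 else seg[-1]
        T_i = end - seg[0]
        L_i = sum(x - seg[0] for x in seg)                  # latency from i_0
        D_i = sum(y - x for x, y in combinations(seg, 2))   # pairs inside segment
        shares.append(q_i * r_i * T_i + q_i * L_i + r_i * (n_i * T_i - L_i) + D_i)
        start += n_i
    return shares

def total_pairwise(positions):
    """Sum of distances over all unordered pairs of points."""
    return sum(y - x for x, y in combinations(positions, 2))

# Hypothetical embedding with three segments of sizes 3, 2, 1.
positions = [0, 1, 3, 4, 7, 8]
shares = segment_shares(positions, [3, 2, 1])
```

Here the shares are 30, 29 and 0, and they add up to the total pairwise distance of 59, which is what lets the proof treat each segment's share independently.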

To prove the statement in Theorem 4, it suffices to find a tour that is within
$(1+\epsilon)$ of the lower bound in the RHS of inequality (3). The expression
for the lower bound on the RHS of inequality (3) is a linear combination of TSP
and Latency values of the tour in segment i. We can apply Lemma 5 to obtain
a tour composed of $O(\log n_i/\epsilon)$ TSP tours. This tour is within a $(1+\epsilon)$ factor of
the lower bound on $C_i$.

A technical detail in this argument is that the coefficient of $L_i$ could be
negative. Lemma 5 does not handle this case. But note that $n_i T_i - L_i$ is the
total "reverse" latency in segment i with the rightmost endpoint being the root.
Thus we can rewrite the lower bound as a linear combination of $T_i$ and $n_i T_i - L_i$
with positive coefficients.

We can thus replace each segment i with a concatenation of $O(\log n_i/\epsilon)$ TSP
tours, without increasing the cost by more than a factor of $(1+\epsilon)$. Since there
are $O(\log n/\epsilon)$ segments in all, it follows that there is an embedding consisting
of $O(\log^2 n/\epsilon^2)$ shortest TSP tours.

Finally, we show how to reduce this number down to $O(\log n/\epsilon^2)$. Let us
rewrite the lower bound in (3) as $(q_i - r_i) L_i + (q_i + n_i) r_i T_i$. Note that $L_i \le n_i T_i$.
This gives us that the term $(q_i - r_i) L_i$ is at most $\epsilon \cdot (q_i + n_i) r_i T_i$, whenever
$q_i - r_i$ is positive. Hence, if we replace the segment i with a shortest TSP tour
on those vertices, the cost will be within $(1+\epsilon)$ of the lower bound in (3). It is
easy to check that, for $i > 1/\epsilon$, we have $q_i > r_i$. Hence for $i = 1, \ldots, 1/\epsilon$, using
Lemma 5, we replace each segment by a concatenation of $O(\log n/\epsilon)$ tours each.
Then for the segments $1/\epsilon$ and above, we use only one minimum TSP tour. Overall
this results in a concatenation of $O(\log n/\epsilon^2)$ tours with near-optimal average
distortion.
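For reference, the rewriting of the lower bound in (3) and the estimate used in this last step can be spelled out as follows. This is a worked restatement rather than a new claim; it uses only $L_i \le n_i T_i$ and $n_i \le \epsilon\, r_i$, and the second line applies when $q_i \ge r_i$.

```latex
\begin{align*}
q_i r_i T_i + q_i L_i + r_i (n_i T_i - L_i)
  &= (q_i - r_i) L_i + (q_i + n_i)\, r_i T_i, \\
(q_i - r_i) L_i \;\le\; q_i L_i \;\le\; q_i n_i T_i
  &\le \epsilon\, q_i r_i T_i \;\le\; \epsilon\, (q_i + n_i)\, r_i T_i .
\end{align*}
```

So when $q_i \ge r_i$ the latency term contributes at most an $\epsilon$ fraction of the span term, which is why a single shortest TSP tour suffices for those segments.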

Note that an optimal TSP tour of the tree is an Euler tour. In other words,
each edge is crossed exactly twice, once in each direction. As a consequence, we 
have the following. 

Theorem 5. There exists a non-contracting embedding of a tree metric into a
line with average distortion at most $(1+\epsilon)$ times the minimum possible that,
when viewed as a walk, crosses every tree edge $O(\log n/\epsilon^2)$ times.
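The basic building block here, an Euler tour of a tree read as a line embedding, can be sketched as follows. Each vertex is placed at the time the walk first reaches it, and each edge is crossed exactly twice. The representation of the tree as nested dictionaries, the function name, and the small star example are assumptions made for illustration.

```python
def euler_tour_embedding(tree, root):
    """tree: {u: {v: weight}} with symmetric entries.
    Returns (positions on the line, number of crossings per edge)."""
    position, crossings = {root: 0.0}, {}
    clock = 0.0

    def visit(u, parent):
        nonlocal clock
        for v, w in tree[u].items():
            if v == parent:
                continue
            edge = frozenset((u, v))
            clock += w                       # walk down the edge
            crossings[edge] = crossings.get(edge, 0) + 1
            position[v] = clock              # first visit fixes the position
            visit(v, u)
            clock += w                       # walk back up the edge
            crossings[edge] += 1

    visit(root, None)
    return position, crossings

# Hypothetical star with center "c" and two leaves.
star = {"c": {"x": 1, "y": 2}, "x": {"c": 1}, "y": {"c": 2}}
position, crossings = euler_tour_embedding(star, "c")
```

In the example the leaves land at positions 1 and 4, so their distance on the line is 3, equal to their tree distance, while the center and the second leaf are stretched from 2 to 4.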

Dynamic programming based on these structural results proves the
following theorem.

Theorem 6. For any given $\epsilon > 0$, there is an algorithm that runs in time
$n^{O(\log n/\epsilon^2)}$ and computes a non-contracting embedding of a given input tree
metric into a line with average distortion at most $(1+\epsilon)$ times the minimum.

Proof (Sketch) 

We now describe the quasi-polynomial-time approximation scheme based on
dynamic programming. Theorem 5 can be restated in terms of crossings of
vertices. Consider a separator vertex for the tree. We will denote the partition of
the tree at the centroid as the left and right parts. There exists a near-optimal
embedding that, when viewed as a tour, crosses the separator node from left half
to right half $O(\log n/\epsilon^2)$ times.

We develop a dynamic program based on the above observation. Given the
input tree, we try each vertex as the starting point of our tour. In order to
compute the tour, we first find a separator node in the tree. For the dynamic
program, we maintain the following state space. Consider the sub-tours formed
between successive places where we cross the separator node. We guess the
number of nodes and the length for each of these sub-tours. Note that since there
are only $O(\log n/\epsilon^2)$ crossings, there are only $n^{O(\log n/\epsilon^2)}$ choices for the number
of nodes. Moreover, the length of each tour can take at most $O(\log n/\epsilon)$
different values. Thus the number of choices for the lengths is bounded by about
$O((\log n)^{\log n})$. Thus the total size of the state space is $n^{O(\log n/\epsilon^2)}$. Finding the best
tour given the lengths of sub-tours can be done by recursing on the left and right
parts independently. For each of these sub-tours, we want to visit all the vertices
while staying on one side throughout. The total running time of this procedure
is $n^{O(\log n/\epsilon^2)}$.

4 Open Problems and Discussion 

It is important to note that a non-contracting embedding can be converted to 
a non-expanding embedding by scaling down all the distances. However, the 
converse is not true, since in non-expanding embeddings, the host metric could
be a semi-metric. In other words, mapping two points in the guest metric to
the same point in the host metric is allowed. This represents a crucial difference 
between the two problems. 

For the case of non-contracting embeddings considered in the paper, here are 
some open questions : 

(1) Is there a simpler and better approximation algorithm for minimizing average 
distortion in trees? 

(2) Can the Quasi-PTAS be extended to (outer)planar graphs? 

(3) A different but related objective function is sum of the distortions of all 
pairs over all non-contracting embeddings. Are there approximation algorithms 
for this objective? 

(4) For the case of weighted average distance, we can write a linear program 
based on the spreading metric LP for minimum linear arrangement (a la Rao & 
Richa [32]). However, the integrality gap of this LP is as yet unknown.

Abstract. Under the robot model, we show that a robot needs
$\Omega(n \log d)$ bits of memory to perform exploration of digraphs with n
nodes and maximum out-degree d. We then describe an algorithm that
allows exploration of any n-node digraph with maximum out-degree d
to be accomplished by a robot with a memory of size $O(nd \log n)$ bits.
Under the agent model, we show that digraph exploration cannot be
achieved by an agent with no memory. We then describe an exploration
algorithm for an agent with a constant-size memory, using a whiteboard
of size $O(\log d)$ bits at every node of out-degree d.



1 Introduction 

A mobile entity (e.g., a software agent or a robot) has to explore a graph by 
visiting all its nodes and traversing all edges, without any a priori knowledge 
of the topology of the graph nor of its size. Once exploration is completed, 
the mobile entity has to stop. We also consider the more demanding task of 
exploration with return in which the entity has to return to its original position, 
and the auxiliary easier task of perpetual exploration in which the entity has to 
traverse all edges of the graph but is not required to stop. The task of visiting all 
nodes of a network is fundamental in searching for data stored at unknown nodes 
of a network, and traversing all edges is often required in network maintenance 
and when looking for defective components. Perpetual exploration may be of 
independent interest, e.g., if regular control of a network for the presence of 
faults is required, and all edges must be periodically traversed over long periods 
of time. 

If nodes and edges have unique labels, exploration can be easily done (e.g., by 
depth-first search). However, in some navigation problems in unknown environ- 
ments such unique labeling may not be available, or limited sensory capabilities 
of the mobile entity may prevent it from perceiving such labels. Hence it is impor- 
tant to be able to program the entity to explore anonymous graphs, i.e., graphs 
without unique labeling of nodes or edges. Arbitrary graphs cannot be explored 
under such weak assumptions, as witnessed by the case of a cycle: without any 
labels of nodes and without the possibility of putting marks on them, it is clearly 
impossible to explore a cycle of unknown size and stop. Hence, we assume, as 
in [5,6,11], some ability of marking nodes. More precisely we consider two differ- 
ent models. In the robot model, the mobile entity is given the ability of dropping 
and removing indistinguishable pebbles at nodes. This model aims to capture the
behavior of a robot in a labyrinth. In the agent model, the mobile entity is given
the ability to read and write messages at memory locations available at each 
node, called whiteboards. This model aims to capture the behavior of a software 
agent in a computer network. Observe that the robot model is weaker than the 
agent model since a robot that is given k pebbles acts as a software agent in a 
network with whiteboards of size one bit, in which at most k whiteboards can 
simultaneously contain a 1. 

Clearly the robot has to be able to locally distinguish ports at a node: oth- 
erwise it is impossible to explore even the star with 3 leaves (after visiting the 
second leaf the robot cannot distinguish the port leading to the first visited leaf 
from that leading to the unvisited one). Hence we make a natural assumption 
that all ports at a node are locally labeled 1, . . . , d, where d is the degree of the 
node. No coherence between those local labelings is assumed. 

In many applications, robots and mobile agents are meant to be simple, often
small, and inexpensive devices, which limits the amount of memory with which
they can be equipped. As opposed to numerous papers that imposed no
tions on the memory of the robot and sought exploration algorithms minimizing 
time, i.e., the number of edge traversals, we investigate the minimum size of the 
memory of the robot that allows exploration of graphs of given (unknown) size, 
regardless of the time of exploration. That is, we want to find an algorithm for 
a mobile entity performing exploration using as little memory as possible, i.e., 
we want to minimize the memory of the robot in the robot model, and we want 
to minimize both the amount of information transported by the agent and the 
size of the whiteboards in the agent model. In the latter case, our specific goal 
is to design an exploration algorithm for an agent with constant memory size, 
using small whiteboards. 



1.1 Our Results 

Under the robot model, we first prove a lower bound of $\Omega(n \log d)$ bits of memory
for perpetual exploration of n-node digraphs with maximum out-degree d. This
lower bound holds even if the robot is given a linear number of pebbles. We
then present two algorithms for exploration with stop in digraphs. One requires
$O(nd \log n)$ bits of memory, and uses one pebble. This algorithm is only an $O(\log n)$
factor away from the optimal in constant-degree digraphs. Its time performance is
however exponential (again, time is measured by the number of edge traversals).
Hence, we also describe another algorithm, which performs exploration with
stop in polynomial time, but requires $O(n^2 d \log n)$ bits of memory, and uses
$O(\log \log n)$ pebbles. This latter algorithm is a variant of the algorithm in [5],
designed for the purpose of compressing the robot memory. Note that it has
been proved [5] that $\Omega(\log \log n)$ pebbles are required to explore in polynomial
time, thus our algorithm is optimal with regard to the number of pebbles.

Under the agent model, we first prove that exploration with stop cannot 
be achieved by an oblivious agent, i.e., an agent carrying no information when 
moving from one node to another. However, we describe an algorithm for an 
agent with constant-size memory. It performs exploration with return in all
digraphs, using a whiteboard of size $O(\log d)$ at every node of out-degree d.
Note that $\Omega(\log d)$ bits is a lower bound for the size of the whiteboards when
using an agent with constant-size memory. Indeed, an agent with a memory of
size $k = O(1)$ and a whiteboard of size $k' = o(\log d)$ generate at most $2^{k+k'} < d$
states, and hence not all out-going edges can be distinguished. Our algorithm is
also optimal according to the following criteria. We mentioned before the lower
bound of $\Omega(n \log d)$ bits of memory for exploration in digraphs under the robot
model. Therefore our algorithm under the agent model demonstrates that the 
memory of the robot can be optimally distributed among the n nodes of the 
digraph. This is in contrast with other contexts (e.g., compact routing) in which 
there is a penalty for the distribution of a centralized data structure. 
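The counting argument for the whiteboard lower bound can be made concrete with a small helper. An agent with k bits of memory and a whiteboard of k' bits has at most $2^{k+k'}$ joint states, so distinguishing d out-going edges needs $k + k' \ge \lceil \log_2 d \rceil$. The function name below is hypothetical.

```python
def min_whiteboard_bits(d, agent_bits):
    """Smallest k' such that 2**(agent_bits + k') >= d, i.e. enough joint
    states of agent memory plus whiteboard to tell d out-going edges apart."""
    needed = (d - 1).bit_length()     # equals ceil(log2(d)) for d >= 1
    return max(0, needed - agent_bits)
```

For instance, with out-degree 1024 and a 2-bit agent the whiteboard needs at least 8 bits, which grows with log d while the agent's memory stays constant.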

1.2 Related Work 

Exploration and navigation problems for robots in an unknown environment 
have been extensively studied in the literature (cf. [14]). There are two groups 
of models for these problems. In one of them a particular geometric setting is 
assumed. Another approach is to model the environment as a graph, assuming 
that the robot may only move along its edges. The graph setting can be further 
specified in two different ways. In [1,5,6,9] the robot explores strongly connected
directed graphs and it can move only in the direction from head to tail of an
edge, not vice-versa. In [2,7,10,11,12,16] the explored graph is undirected and
the robot can traverse edges in both directions. (See also [13] and the references
therein where parallel search is investigated.) In the graph setting it is often
required that apart from completing exploration the robot has to draw a map 
of the graph, i.e., output an isomorphic copy of it. 

Graph exploration scenarios considered in the literature differ in an impor- 
tant way: it is either assumed that nodes of the graph have unique labels which 
the robot can recognize, or it is assumed that nodes are anonymous. It is impos- 
sible to explore arbitrary anonymous graphs if no marking of nodes is allowed. 
Hence the scenario adopted in [5,6] was to allow pebbles which the robot can
drop on nodes to recognize already visited ones, and then remove them and 
drop in other places. The authors concentrated attention on the minimum num- 
ber of pebbles allowing efficient exploration and mapping of arbitrary directed 
n-node graphs. (In the case of undirected graphs, one pebble suffices for efficient 
exploration [11].) In [6] the authors compared exploration power of one robot 
with pebbles to that of two cooperating robots. In [5] it was shown that, to per- 
form exploration in polynomial time, one pebble is enough if the robot knows an 
upper bound on the size of the graph. However, without the knowledge of any 
bound on the size of the graph, Θ(log log n) pebbles are necessary and sufficient
for exploration in polynomial time. 

The efficiency measure adopted in most papers dealing with graph explo- 
ration is the completion time of this task, measured by the number of edge 
traversals. On the other hand, there are no restrictions imposed on the mem- 
ory of the robot. Minimizing the memory of the robot for the exploration of 
anonymous non-directed graphs has been addressed in, e.g., [8,10,16,17]. Most
previous works deal with perpetual exploration. For instance, it is shown
in [17] that, with no pebble, no finite set of finite automata can perform perpet- 
ual exploration of all cubic planar graphs. Using a pebble, exploration with stop 
of undirected graphs is much facilitated by the ability of backtracking. In particular,
it is easy to design an exploration algorithm for a robot with O(D log d) bits
of memory, where D denotes the diameter of the graph. Also, a simple variant
of the algorithm in [11] yields a bound of O(n log d) bits. Better bounds are known
for specific families of graphs. For instance, it is shown in [10] that exploration
with stop in n-node trees requires a robot with memory size Ω(log log log n), and
that exploration with return in n-node trees can be achieved by a robot with
O(log^2 n) bits of memory. Our paper focuses on directed graphs.

It is worth mentioning that our work has connections with derandomized 
random walks (cf. [10] and the references therein). There, the objective is to 
produce an explicit universal traversal sequence (UTS), i.e., a sequence of port 
labels, such that the path guided by this sequence visits all edges of any graph. 
However, without the a priori knowledge of n, none of these UTSs allows the
robot to stop. Moreover, even if bounds on the length of these sequences have
been derived, they provide little knowledge on the minimum number of states
for graph exploration by a robot. For instance, sequences of length Ω(n log n)
are required to traverse all degree-2 graphs with n nodes [3], although a 2-state
robot can explore all degree-2 graphs. 

2 Terminology and Models 

An anonymous graph (resp., digraph) with locally labeled ports is a connected 
graph (resp., strongly connected digraph) whose nodes are unlabeled, and edges 
incident to a node v have distinct labels 1, ..., d, where d is the degree of v. Thus
every undirected edge {u, v} has two labels which are called its port numbers
at u and at v. Port numbering is local: there is no relation between labels at u
and at v. In digraphs, edges out-going from a node v have distinct labels 1, ..., d,
where d is the out-degree of v. Edges incoming to a node v are not labeled at v. 

We are given a mobile entity traveling in an anonymous (di)graph with locally 
labeled ports. The graph and its size are a priori unknown to the entity. We 
consider the two following models. 

Robot model. The mobile entity is called a robot. A robot with k-bit memory is
a finite automaton of K = 2^k states among which a specified state S_0 is called
initial and some specified states are called final. The robot is originally given a
source of indistinguishable pebbles. If the robot is in a node v in a non-final state
S, this state determines a local port number p, and the decision of dropping a
pebble at v, removing a pebble from v (if such a pebble is currently present at
v), or doing nothing. Then the robot leaves the node by port p. Upon traversing
the corresponding edge, the behavior of the robot differs depending on whether the
graph is directed or not.

In graphs, the robot reads the port number i at the node it enters, and the
degree d of this node. It also detects the presence or absence of a pebble at this node:
b = 0 if there is no pebble, and b = 1 otherwise. The triple (i, d, b) is an input symbol
that causes the transition from state S to S'.

In digraphs, the robot reads the out-degree d of the node it enters, and checks
for the presence of a pebble at this node. The pair (d, b) is an input symbol
that causes the transition from state S to S'.

In both cases, the robot continues moving in this way until it enters a final 
state for the first time. Then it stops. 
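
As an illustration of the robot model in digraphs, the following Python sketch simulates a robot given as a finite automaton. The names run_robot, action and delta, the dictionary encoding of the digraph, and the three-state example robot are assumptions made here for illustration only; they do not come from the paper.

```python
def run_robot(graph, start, action, delta, initial, final, max_steps=10_000):
    """Simulate a robot in an anonymous digraph with locally labeled ports.

    graph[v] is the list of successors of v; entry p-1 is reached by port p.
    action(state) -> (port, op) with op in {"drop", "pick", "none"}.
    delta(state, d, b) -> next state, where d is the out-degree of the node
    just entered and b is 1 if a pebble lies there, 0 otherwise.
    """
    pebbles = set()                      # nodes currently holding a pebble
    node, state, steps = start, initial, 0
    while state not in final and steps < max_steps:
        port, op = action(state)
        if op == "drop":
            pebbles.add(node)
        elif op == "pick":
            pebbles.discard(node)
        node = graph[node][port - 1]     # leave through the chosen port
        d = len(graph[node])             # input symbol, first component
        b = 1 if node in pebbles else 0  # input symbol, second component
        state = delta(state, d, b)
        steps += 1
    return node, state, steps


# Hypothetical example: explore a directed cycle with one pebble and stop.
def action(state):
    return (1, "drop") if state == "start" else (1, "none")


def delta(state, d, b):
    return "done" if b == 1 else "walk"


cycle = {i: [(i + 1) % 5] for i in range(5)}
print(run_robot(cycle, 2, action, delta, "start", {"done"}))
```

The example robot only works on directed cycles; it is meant to show how the state, the port choice, the pebble action and the input pair (d, b) interact, not to solve exploration in general.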

Agent model. The mobile entity is called an agent. An agent with k-bit memory is
a pair (P, M) where P is a constant-size program, and M is a memory of size k
bits. In the agent model, every node is given computing facilities, including a CPU
and q bits of local memory. The local memory is called a whiteboard. Initially, all
whiteboards are empty, and the agent memory contains an initial k-bit binary
string s_0. Every agent-node pair forms a system that acts as a finite automaton
of 2^{k+q} states. A state of the system is a pair S = (s, w) where s is the content
of the agent memory, and w is the content of the whiteboard. This includes
some specified states called final. When the system is in a non-final state S,
the agent is operated as follows. The state S determines a local port number p,
a k-bit binary string s', and a q-bit binary string w'. Then w' is written on the
whiteboard, s' is stored in the agent memory, and the agent is sent through port
p. Upon reception of the agent by a node, the operation performed by that node
differs depending on whether the graph is directed or not.

In graphs, let i be the port number through which the agent enters the current
node. Let d be the degree of that node, and let s and w be the current contents of
the agent memory and the node whiteboard. The pair (i, d) is an input symbol
that causes transition of the system from state S = (s, w) to S' = (s', w') by
application of program P.

In digraphs, the out-degree d is an input symbol that causes the transition
from state S = (s, w) to S' = (s', w') by application of program P. (There is no
access to the input port number.) 

In both cases, the agent continues moving in this way until it enters a final 
state for the first time. Then it stops. 

Remark. Most of our exploration algorithms under the robot model actually 
perform in the weakest version of the model, i.e., when the robot is given a 
unique pebble. 

We consider three tasks of increasing difficulty: perpetual exploration in which 
the mobile entity has to traverse all edges of the (di)graph but is not required 
to stop, exploration with stop (often simply called exploration in this paper) in 
which starting at any node of the graph, the entity has to traverse all edges and 
stop at some node, and exploration with return in which starting at any node 
of the graph, the entity has to traverse all edges and stop at the starting node. 
An entity is said to perform one of the above tasks in a (di)graph, if starting at 
any node of this graph in the initial state, it completes this task in finitely many 
steps. (Notice that in the case of perpetual exploration, completing this task 
after finitely many steps means only traversing all edges, not necessarily stopping
after it.) We compute the memory requirement of an exploration algorithm by
measuring either the size of the robot memory in the robot model, or both the 
size of the agent memory and the size of the whiteboards in the agent model. 

Terminology. A one-to-one and onto mapping f between the two sets of nodes
V and V' of two edge-labeled graphs G = (V, E) and G' = (V', E') is an isomorphism
if, for every two nodes x and y in V: (x, y) ∈ E ⟺ (f(x), f(y)) ∈ E', and
the two edges have the same label. In the map drawing problem, the robot (resp.,
the agent) has to compute an edge-labeled graph G such that G is isomorphic to
X where X is the unknown edge-labeled graph that the robot (resp., the agent)
is exploring. Given two digraphs G and X, and two nodes u and x of G and X,
respectively, we write (G, u) = (X, x) if there exists an isomorphism f between
G and X, such that f(u) = x.
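
The relation (G, u) = (X, x) can be checked directly when both digraphs are fully known. The sketch below is an offline illustration of the definition, not the robot's procedure (the robot cannot see X). It assumes strongly connected digraphs encoded as dictionaries mapping each node to its list of successors by port; the function name rooted_isomorphic and this encoding are hypothetical. Because out-going ports are labeled, mapping u to x forces the image of every other node, so a single breadth-first traversal suffices.

```python
from collections import deque


def rooted_isomorphic(G, u, X, x):
    """Return True if a label-preserving isomorphism f with f(u) = x exists.

    G[v] and X[v] list the successors of v; entry p-1 is the head of port p.
    Both digraphs are assumed to be strongly connected.
    """
    if len(G) != len(X):
        return False
    f = {u: x}            # partial isomorphism built so far
    image = {x}           # nodes of X already used as images
    queue = deque([u])
    while queue:
        a = queue.popleft()
        b = f[a]
        if len(G[a]) != len(X[b]):        # out-degrees must agree
            return False
        for ga, xb in zip(G[a], X[b]):    # same port number on both sides
            if ga in f:
                if f[ga] != xb:           # conflicting image
                    return False
            elif xb in image:             # f would not be one-to-one
                return False
            else:
                f[ga] = xb
                image.add(xb)
                queue.append(ga)
    return len(f) == len(G)


G = {0: [1, 0], 1: [0, 0]}
X = {"a": ["b", "a"], "b": ["a", "a"]}
print(rooted_isomorphic(G, 0, X, "a"), rooted_isomorphic(G, 0, X, "b"))
```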

3 Exploration of Directed Graphs under the Robot 
Model 

We first prove a lower bound on the size of the robot memory. The proof uses 
the digraph combination lock (see, e.g., [15]) defined as follows. 

Definition 1. The combination lock L_{d,n} is a regular digraph of out-degree d,
and order n. The n nodes u_0, u_1, ..., u_{n-1} are connected as follows. For every
i < n-1, node u_i has one out-going edge pointing to u_{i+1}, and d-1 out-going
edges pointing to u_0. Node u_{n-1} has all its d out-going edges pointing to u_0.
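
A combination lock is easy to build explicitly, which helps when experimenting with exploration strategies. The following Python sketch is an illustration only: the function names and the parameter combo (the port of each u_i that leads to u_{i+1}) are hypothetical choices, and nodes are represented by their indices.

```python
from itertools import product


def combination_lock(d, n, combo):
    """Build L_{d,n} as a dict: node index i -> list of successors by port.

    Entry p-1 of lock[i] is the node reached from u_i through out-port p.
    combo[i] is the port of u_i that leads to u_{i+1}, for i < n-1.
    """
    assert d >= 1 and n >= 1 and len(combo) == n - 1
    assert all(1 <= p <= d for p in combo)
    lock = {}
    for i in range(n):
        successors = [0] * d                   # by default, back to u_0
        if i < n - 1:
            successors[combo[i] - 1] = i + 1   # the single edge u_i -> u_{i+1}
        lock[i] = successors
    return lock


def follow(lock, ports, start=0):
    """Follow a sequence of port numbers from `start`; return the final node."""
    node = start
    for p in ports:
        node = lock[node][p - 1]
    return node


def all_labelings(d, n):
    """Enumerate the locks obtained from every choice of forward ports."""
    return [combination_lock(d, n, combo)
            for combo in product(range(1, d + 1), repeat=n - 1)]


lock = combination_lock(3, 4, [2, 1, 3])
print(follow(lock, [2, 1, 3]))   # dials the full combination
print(follow(lock, [2, 1, 1]))   # wrong last port
```

Any wrong port sends the walker back to u_0, which is what makes these digraphs hard to traverse without remembering the combination.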

Theorem 1. Perpetual exploration in n-node digraphs of maximum out-degree
d > 2 cannot be accomplished by a robot with less than Ω(n log d) bits of memory,
even if it is given up to n pebbles. For d = 2, the result holds even if the robot is
given up to n/2 pebbles.

Proof. Let d and n be given, together with a robot able to explore all n-node digraphs
of maximum out-degree d, thus including all distinct edge-labeled combination
locks L_{d,n}. Assume that the robot is given k pebbles, k ≥ 1. A full run of the
robot in L_{d,n} is a run of the robot along the path u_0, u_1, ..., u_{n-1}. For every
edge-labeled combination lock, place the robot at node u_0, and let us consider
the state of the robot at u_0 before its first full run. (For each exploration, there
are at least d full runs since node u_{n-1} must be reached at least d times to
traverse its d out-going edges.) Since the n nodes u_i, i = 0, ..., n-1, look
identical to the robot up to the presence of a pebble, the ability to perform a
full run is determined by the state of the robot just before leaving node u_0, and
by the positions of the k pebbles. There are d^{n-1} different labelings of the edges
(u_i, u_{i+1}), i = 0, ..., n-2, and p = \binom{n}{k} possible positions for the pebbles.

Therefore, the robot must be able to be in at least d^{n-1}/p different states at u_0.
Thus it must have at least ⌈(n-1) log d - log p⌉ bits of memory. Since p < 2^n,
the result follows for d > 2. For d = 2, we use the fact that ()]) < (^)*' for 
0 < a < 6, where Ine = 1. Since k < n/2, we have p < fc(^)", and thus 
logp < logfc-l- nlog(^). We have k < n/2 < 2n/e, thus logp < logn-|- an with 
a < 1, which completes the proof. □ 
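
To get a feel for the counting bound in the proof above, the quantity ⌈(n-1) log d - log p⌉ can be evaluated for small parameters. The sketch below is only an illustration: the function names are hypothetical, logarithms are taken base 2, and the helper pebble_placements assumes that p counts the ways of putting k indistinguishable pebbles on k distinct nodes among n.

```python
import math


def pebble_placements(n, k):
    # Assumption of this sketch: k indistinguishable pebbles on k distinct nodes.
    return math.comb(n, k)


def memory_lower_bound_bits(n, d, p):
    """Evaluate ceil((n-1) * log2(d) - log2(p)), clamped at 0."""
    bits = (n - 1) * math.log2(d) - math.log2(p)
    return max(0, math.ceil(bits))


for n, d, k in [(5, 4, 1), (10, 3, 1), (20, 4, 2)]:
    p = pebble_placements(n, k)
    print(n, d, k, memory_lower_bound_bits(n, d, p))
```

For n = 5, d = 4 and a single pebble, there are 4^4 = 256 labelings and 5 pebble positions, so the bound evaluates to ⌈8 - log 5⌉ = 6 bits.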

Note that our exploration algorithms use far fewer than O(n) pebbles. One
of them uses only one pebble, and the other uses O(log log n) pebbles. We first
consider an exploration algorithm, called Test-all-maps, satisfying
the following:

Theorem 2. Under the robot model, Algorithm Test-all-maps accomplishes
exploration with stop in any digraph with a robot using one pebble, and whose
memory does not exceed O(nd log n) bits in n-node digraphs of maximum
out-degree d.

We first sketch the description of Algorithm Test-all-maps and later prove 
that it satisfies the statement of Theorem 2. 



Algorithm Test-all-maps. The robot successively tries every value for n, starting
at n = 1. For a fixed n, the robot tries all possible maps of edge-labeled
digraphs of order n. For a given map G = (V, E), with V = {v_1, ..., v_n},
the robot proceeds as follows. Let x be the current position of the robot in
the unknown digraph X, and assume that the robot holds the pebble. The
robot chooses node v_1 ∈ V, and tests whether it is standing on node v_1 of
G, i.e., whether (G, v_1) = (X, x). This is done using Procedure
Check-Consistency, which is detailed later in the text. This procedure takes
as input a graph G and a node v of G, and tests whether the robot is currently
standing at v in G. If the test succeeds, then the exploration stops. Otherwise, the
robot chooses another node v_2 ∈ V, and tests whether (G, v_2) = (X, x). Observe
that during Procedure Check-Consistency, the robot moves in the graph X,
and thus, since the procedure failed for v_1, there is no guarantee that the robot
is still standing at node x of X. Hence, the robot uses a linear array position,
of size n, such that position[i] is the index j of the node v_j ∈ V where the
robot would now be standing if its original position x satisfied
x = v_i. Assuming x is node v_2 of G, the robot would now stand on node v_j,
j = position[2]. The robot thus executes Procedure Check-Consistency with
input (G, v_j). If the procedure succeeds, then the exploration stops. Otherwise,
the robot chooses the next node v_3, and tests whether (G, v_3) = (X, x). The
robot thus executes Procedure Check-Consistency with input (G, v_j) where
j = position[3]. This process is carried on until either a test is eventually
satisfied, or all nodes of G have been exhausted. In the latter case, the robot picks
the next map, and repeats the same scenario until it finds the map of the a priori
unknown explored digraph. Now, we describe Procedure Check-Consistency.



Procedure Check-Consistency(G, u). Given the map of an edge-labeled graph
G = (V, E), with n nodes and maximum out-degree d, and given a node u of G,
Procedure Check-Consistency checks whether the robot is currently standing
at node u of G, i.e., whether (G, u) = (X, x) where x is the current position of the
robot in the unknown digraph X. The procedure borrows from [5] the technique
of marking nodes of a cycle. However, this technique is implemented without the
use of a large data-structure. More precisely, the robot assigns numbers, from 1 to
m, to all the m ≤ nd edges of the map G, with the additional condition that the
edge labeled 1 is out-going from u, and the edge labeled m is incoming to u (such
an edge does exist because G is strongly connected). Thus E = {e_1, ..., e_m}.
For every i ∈ {1, ..., m-1}, the robot computes a shortest path P_i in the map
G starting from the head of edge e_i to the tail of edge e_{i+1}. During Procedure
Check-Consistency, the paths P_i are computed on-line, and at most one path
is stored at any given time in the robot memory. Let C be the following closed
walk starting and ending at u: C = e_1, P_1, ..., e_{m-1}, P_{m-1}, e_m. This walk will be
traversed several times during the execution of Procedure Check-Consistency.
C is thus recomputed several times by the robot, and when P_i is computed, the
robot forgets about path P_{i-1}.

There are at most n phases in Procedure Check-Consistency, one for every
node of G. (The procedure assumes that the robot holds the pebble. Otherwise,
the robot runs Procedure Find-Pebble described later.) For every phase there is
a new considered node. During Phase i, the robot leaves u with the pebble, and
follows the edges of C until it visits a node w of G that has not yet been considered
during the i-1 previous phases. This node is marked considered on the map of
G, and the pebble is dropped there. (Hence the first considered node is node u.)
Then the robot carries on its walk guided by C until, according to the map, it is
back at u. Now, the robot traverses C again. On its way along C, it checks
the following property P: the pebble is at the current node x if and only if x is the
considered node w, according to the map of G. If property P is satisfied for every
node of G, then the robot follows C once again to bring the pebble back to u. If
there is yet another node to be considered, then the next phase proceeds with this
node. Otherwise the robot completes Procedure Check-Consistency as follows.
It executes a last journey along C to check whether there is equality between
the degree of each node in the map G, and the degree of the corresponding node
in the explored graph X. If so, the robot returns success. The robot turns into
state failure as soon as it detects a problem at any step (e.g., the pebble is not
where it should be, the pebble is where it should not be, the degree-sequences
are different in the map and in the explored graph, etc.). As in [5], we have:

Lemma 1. Given a robot at node x of an anonymous digraph X, Procedure
Check-Consistency returns success for (G, u) if and only if (G, u) = (X, x).

If the robot loses the pebble during the execution of Procedure
Check-Consistency(G, u), then either the map G is not correct, or it is correct but the
robot was not at u. The robot then looks for the pebble by running the following
procedure: 



Procedure Find-Pebble. The robot computes a (not necessarily simple) closed
path P in the map G, visiting all nodes {v_1, ..., v_n} of G. P is computed
on-line, e.g., P is a sequence of sub-paths P_i from v_i to v_{i+1}, i = 1, ..., n-1,
and the P_i's are computed one after the other. The robot traverses the path P
several times, successively assuming that it starts from a node v_i of the map,
i = 1, ..., n, and using an array position as in Algorithm Test-all-maps. If
the robot does not find the pebble, then the current map G is for sure not a
map of the explored digraph X. Therefore the robot considers the next map,
and looks for the pebble in this new map using the same strategy as above. The
robot proceeds this way until it finds the pebble when considering some map
H. Once the pebble is found, the robot returns to the execution of Algorithm
Test-all-maps, and tests the current map H.

Proof of Theorem 2. We prove that the algorithm Test-all-maps can be implemented
so as not to use more than O(nd log n) bits of memory in n-node digraphs of
maximum out-degree d. It is easy to list all edge-labeled digraphs with at most
n nodes and maximum out-degree d using an array of O(nd log n) bits. Since
the cycle C = e_1, P_1, e_2, P_2, ..., e_{m-1}, P_{m-1}, e_m visiting all edges of a given map
is computed on the fly, and since any path P_i can be encoded by a sequence
of at most D labels, where D is the diameter of G, we get that Procedure
Check-Consistency requires O(D log d) ≤ O(n log d) bits of memory for the
storage of C. The same holds in Procedure Find-Pebble for the storage of P.
Thus the robot does not use more than O(nd log n) bits of memory in total. □

The algorithm Test-all-maps performs exploration in exponential time in
the worst case (recall that time is counted as the number of edge traversals).
Nevertheless, we can describe a variant of Algorithm Explore-and-Map presented
in [5]. Although polynomial in time, Explore-and-Map is costly in term of mem- 
ory space: a rough analysis shows that it requires a memory of 0 {rr‘dlogn) bits. 
Our variant is called Compacted-Explore-and-Map. We summarize its perfor- 
mances by the following: 

Theorem 3. For any n-node digraph of maximum out-degree d, Algorithm
Compacted-Explore-and-Map accomplishes exploration with stop in polynomial time
under the robot model, with a robot using O(log log n) pebbles and a memory of
size O(n^2 d log n) bits.

4 Exploration of Directed Graphs under the Agent 
Model 

This section is dedicated to the agent model, i.e., nodes are given whiteboards 
on which the agent can read, erase, and write messages. The goal is to limit the 
sizes of both the agent memory, and the nodes’ whiteboards. We first observe 
that exploration is impossible with an agent that performs obliviously, that is 
carrying no information from node to node. 

Theorem 4. Under the agent model, exploration with stop cannot be achieved 
by an agent with zero bit of memory. 

Proof. Assume for the purpose of contradiction that exploration with stop can
be achieved by an agent with zero bit of memory. Then consider regular digraphs
of out-degree d > 2. The content w_i of the whiteboard of a node u at the ith visit
of that node by the agent is independent of u. Therefore w_{i+1} = f(w_i), where
f is a function that is uniquely defined by the program P of the agent. Thus
the decision to stop depends only on the number of times the agent visits the
same node. Let k be the smallest integer such that w_k is a final state. Let L_{d,k+1}
be the combination lock of out-degree d and order k + 1. To traverse all edges
incoming to the first node u_0 of L_{d,k+1}, the agent must visit node u_0 at least
k + 1 times. Since it stops at the kth visit, not all edges have been traversed,
and thus exploration is not completed, a contradiction. □

Theorem 5. Under the agent model, Algorithm DFS accomplishes exploration
with return in any digraph using an agent with O(1) bits of memory. DFS uses
O(log d) bits of memory per node of out-degree d.

We first describe Algorithm Next-Port that performs perpetual exploration
in any digraph.

Algorithm Next-Port.

1. If the current node whiteboard is empty, then the agent writes 1 on it, and
leaves the node through port 1;

2. Otherwise let i be the integer written on the whiteboard, and let d be the
out-degree of the node. The program erases the whiteboard, writes j =
(i mod d) + 1 on it, and the agent leaves the node through port j.
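
The two rules of Next-Port translate almost literally into code. The Python sketch below is a simulation under stated assumptions: the digraph is a dictionary mapping each node to its list of successors by port, whiteboards are kept in a dictionary where a missing key models an empty whiteboard, and the function names are hypothetical. The agent itself carries no state between nodes.

```python
def next_port_step(graph, whiteboards, node):
    """Apply Next-Port at `node`; return (port taken, node reached)."""
    d = len(graph[node])
    i = whiteboards.get(node)              # None models an empty whiteboard
    j = 1 if i is None else (i % d) + 1    # rule 1, otherwise rule 2
    whiteboards[node] = j
    return j, graph[node][j - 1]


def steps_to_traverse_all_edges(graph, start, max_steps, whiteboards=None):
    """Run Next-Port from `start`; return the step at which every edge has
    been traversed at least once, or None if `max_steps` is not enough."""
    wb = dict(whiteboards or {})
    total = sum(len(successors) for successors in graph.values())
    seen = set()
    node = start
    for step in range(1, max_steps + 1):
        port, nxt = next_port_step(graph, wb, node)
        seen.add((node, port))
        node = nxt
        if len(seen) == total:
            return step
    return None


# Combination lock L_{2,2}: port 1 of u_0 leads to u_1, all other ports to u_0.
lock = {0: [1, 0], 1: [0, 0]}
print(steps_to_traverse_all_edges(lock, 0, 100))
```

Passing a non-empty whiteboards dictionary lets one experiment with arbitrary initial whiteboard contents, in the spirit of the self-stabilization property of Next-Port.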

Lemma 2. Algorithm Next-Port accomplishes perpetual exploration of any
digraph using an agent with zero bit of memory, and uses O(log d) bits of memory
per node of out-degree d.

Remark. Algorithm Next-Port is used several times as a sub-routine in Algorithm
DFS, and thus will be called with non-empty whiteboards. Nevertheless,
it was shown in [4] that Algorithm Next-Port is self-stabilizing and thus does not
require the whiteboards to be initially empty to eventually perform correctly.

Algorithm DFS. Algorithm DFS performs a depth-first search (DFS) in the graph,
using Algorithm Next-Port as a sub-routine. Nodes visited during the DFS are
marked visited on their whiteboards. The last visited node is marked last.
There is at most one node marked last during the execution of DFS. When
exploration starts, the node on which the agent is placed is marked visited and
last. It is also marked root. The path from the root to the last node is
maintained thanks to port numbers that are stored on the whiteboards during the
exploration. This path is called the main path. The agent leaves the root through
port number 1. The DFS proceeds by successively traversing incident edges
of any node u in order 1, 2, ..., d where d is the out-degree of u. Before leaving
the last node u, the port number through which the agent leaves is stored on u’s
whiteboard. Assume that the agent then reaches node v. There are two cases,
depending on whether node v has been visited or not.

If v has not yet been visited, it is marked visited. The agent then starts
Algorithm Next-Port to find the root. From Lemma 2, this task will eventually
succeed. From the root, the agent follows the main path and eventually reaches
the last node u. There, the mark last is erased from u’s whiteboard. The agent
then leaves u by the port whose number is stored on u’s whiteboard, to reach
v again. Node v is marked last. This sequence of instructions is repeated until
the agent reaches a node v that has been previously visited during the DFS.

If the agent reaches a node v that is marked visited, it runs Algorithm
Next-Port to find the root, and follows the main path from the root to the last
node u. Once back at u, there are two sub-cases. If the port number p of the
edge leading to v is smaller than the out-degree d of u, then the agent leaves u
through port p + 1, and repeats the same sequence of instructions as described
before. If p = d, then the agent aims to backtrack. For that purpose, it runs
Algorithm Next-Port to return to the root. The goal of the agent is to find
the node of the main path that stands just before the node marked last. It
marks the root as next, and proceeds as follows. From the node marked next,
the agent goes down one step along the main path to reach some node w. If w
is not marked last, the agent goes back to the root, follows the main path to
the node marked next, erases next from the whiteboard of that node, moves
to w, and marks w as next. This is repeated until the agent finds the last node.
Then it erases the mark last from the whiteboard of that node, goes back to
the root using Next-Port, follows the main path until the node marked next,
and replaces the mark next by last.

The process above is repeated until all edges out-going from the root have
been visited, and the last backtrack leads to the root. Then the agent stops.

Proof of Theorem 5. During the execution of Algorithm DFS, the agent is clearly
in a constant number of different states, hence a memory of O(1) bits is enough
for the agent. There is a constant number of marks written on each whiteboard.
However, the storage of the port numbers of the main path, as well as the local
storage used by Algorithm Next-Port (cf. Lemma 2), require whiteboards of size
O(log d) bits. □

Remark. It is possible to call Algorithm Next-Port only once (amortized), and
to use it to construct a tree whose edges are pointing toward the root. Then 
returning to the root in Algorithm DFS takes a linear time after the first run of 
Algorithm Next-Port. 

5 Conclusion and Further Works 

Our algorithm Test-all-maps requires the storage of a test map of the unknown 
explored digraph. Graph exploration is however a weaker task than map drawing. 
One may thus expect to find an algorithm using a memory smaller than the 
size of a map. Another interesting direction of research is the investigation of 
compact exploration under the constraint that the algorithm must perform in 
polynomial time (i.e., the mobile entity must perform a polynomial number of 
edge-traversals). We described an algorithm for polynomial-time exploration, 
using a robot with a memory of size O(n^2 d log n) bits. This is however far from
the Ω(n log d) lower bound, and it would be interesting to determine the exact
trade-off between time and memory space for graph exploration.

Abstract. Motivated by the wavelength assignment problem in WDM
optical networks, we study path coloring problems in graphs. Given a set 
of paths P on a graph G, the path coloring problem is to color the paths 
of P so that no two paths traversing the same edge of G are assigned 
the same color and the total number of colors used is minimized. The 
problem has been proved to be NP-hard even for trees and rings. 

Using optimal solutions to fractional path coloring, a natural relaxation 
of path coloring, on which we apply a randomized rounding technique 
combined with existing coloring algorithms, we obtain new upper bounds 
on the minimum number of colors sufficient to color any set of paths on 
any graph. The upper bounds are either existential or constructive. 

The existential upper bounds significantly improve existing ones provided 
that the cost of the optimal fractional path coloring is sufficiently large 
and the dilation of the set of paths is small. Our algorithmic results 
include improved approximation algorithms for path coloring in rings 
and in bidirected trees. Our results extend to variations of the original 
path coloring problem arising in multifiber WDM optical networks.



1 Introduction 

We study path coloring problems in graphs. Let P be a set of paths on a graph
G and k > 0 be an integer. The paths of P and the edges of G may be directed
or undirected. The path k-coloring problem (or, simply, path coloring when
k = 1) is to assign colors to the paths of P in such a way that at most k paths
with the same color share an edge of the graph and the total number of colors is 
minimized. The problem has been proved to be NP-hard, even for k = 1 and even 
for the simplest topologies of rings and trees. Thus, approximation algorithms 
are essential. 

The problem has application to Wavelength Division Multiplexing (WDM) 
optical networks [18]. Such networks consist of nodes connected with fibers. 
Connection requests are pairs of nodes to be thought of as transmitter-receiver
pairs. For each connection request, WDM technology routes the request through
a transmitter-receiver path and assigns this path a wavelength, in such a way that 
paths going through the same fiber are assigned different wavelengths. Recently, 
the multifiber WDM network model was introduced [6,15,12]. In these networks 
each fiber of the standard model is replaced by k identical “parallel” fibers. 

For path coloring problems, bounds on the number of colors are usually 
expressed as a function of the load of the set of paths given as input, i.e., the 
maximum number of paths going through any edge of the graph. Erlebach et al. 
[5] present an algorithm that colors any set of paths of load L on a bidirected tree 
with at most 5L/3 colors. Auletta et al. [1] present a randomized algorithm that 
colors any set of paths of load A on a bidirected binary tree of depth with 

at most 7L/5 + o(L) colors, with high probability. In rings, Tucker’s algorithm
[19] colors any set of paths of load L with 2L colors or with ]"|5|T] -I- 1 colors 
where I is the minimum number of paths necessary to cover the ring, as shown 
recently in [13,20]. The interested reader may refer to [2] for a survey on path 
coloring results motivated by WDM optical networks. 

Upper bounds of (1 + 1/k) L/k + c_k (where c_k depends only on k) for path
k-coloring in rings are presented in [15,12]. The results in [5,1] can be trivially
modified to give ⌈5L/(3k)⌉ and 7L/(5k) + o(L/k) upper bounds for path k-coloring in
arbitrary and binary bidirected trees, respectively. Note that L/k is a lower
bound on the minimum number of colors necessary to k-color any set of paths
of load L. Thus, by dividing the upper bound on the number of colors achieved 
by an algorithm by L/k we obtain an upper bound on its approximation ratio. 

Another approach is to design approximation path coloring algorithms which 
use optimal fractional colorings to obtain provably good approximations of the 
optimal path coloring. Given a set of paths on a graph, we may think of the path 
k-coloring problem as the problem of covering the paths by as few as possible
k-independent sets of paths, i.e., sets of paths in which at most k paths share an
edge of the graph. This can be captured by the following integer linear program

    minimize    \sum_{I \in \mathcal{I}} x(I)

    subject to  \sum_{I \in \mathcal{I} : p \in I} x(I) \ge 1,    p \in P

                x(I) \in \{0, 1\},                                I \in \mathcal{I}

where \mathcal{I} denotes the set of the k-independent sets of P. This formulation has a
natural linear programming relaxation by substituting the integrality constraint
by x(I) ≥ 0. The corresponding combinatorial problem is called the fractional
(path) k-coloring problem [3,8] and any feasible solution to the linear program
is called a fractional k-coloring of P. Given a set of paths P on a graph G,
we denote by w_k(P, G) and f_k(P, G) the cost of the optimal solution of the
integer linear program and its relaxation, respectively. Alternatively, one may
see the (fractional) path coloring problem for a set of paths P on a graph G as 
a (fractional) graph coloring problem on the conflict graph of P, i.e., the graph 
which has a node for each path of P and an edge between two nodes if the 
corresponding paths traverse the same edge on G. 
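To make the covering view concrete, the following minimal sketch computes the load of a small instance and finds, by exhaustive search, the fewest $k$-independent sets that cover all paths. The representation (a path as a set of edge identifiers) and the five-path instance are illustrative assumptions, not taken from the paper, and the search is exponential, so it is only suitable for toy inputs.

```python
from itertools import product

def load(paths):
    """Maximum number of paths that share a single edge."""
    counts = {}
    for p in paths:
        for e in p:
            counts[e] = counts.get(e, 0) + 1
    return max(counts.values(), default=0)

def is_k_independent(paths, k):
    """True if at most k of the given paths share any edge."""
    return load(paths) <= k

def min_k_coloring(paths, k):
    """Fewest k-independent classes covering all paths (brute force)."""
    n = len(paths)
    for c in range(1, n + 1):
        for assignment in product(range(c), repeat=n):
            classes = [[paths[i] for i in range(n) if assignment[i] == col]
                       for col in range(c)]
            if all(is_k_independent(cls, k) for cls in classes):
                return c
    return 0  # empty input needs no colors

# Hypothetical instance: five paths on a line graph with edges 0..3.
paths = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3}),
         frozenset({1}), frozenset({0, 1, 2, 3})]

print(load(paths), min_k_coloring(paths, 1), min_k_coloring(paths, 2))
```

On this instance the load is 4, so the lower bound $L/k$ is met both for $k = 1$ and for $k = 2$.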

In general, fractional path coloring is hard to approximate, while it can be
approximated within $\alpha$ in polynomial time provided that $\alpha$-approximate
independent sets can be computed efficiently [8,9,10]. The techniques of [8,9,10] can be
applied to fractional path $k$-colorings as well. However, they constitute general
ways of approximating the optimal objective value of the corresponding linear
program with an exponential number of variables, whereas approximation
algorithms for path coloring round a provably good solution for fractional path coloring
(the values of the variables of the corresponding linear program) to
an integral one, which gives a path coloring. So, previous work (as well as this
paper) seeks formulations of fractional path coloring as a linear program with
a polynomial number of variables.

The work of Kumar [11] on the path coloring problem in rings (also known
as the circular arc coloring problem) uses a reduction to instances of integral
multicommodity flow due to Tucker [19]. Kumar solves the relaxation of the
multicommodity flow problem optimally (this is equivalent to computing the optimal
fractional coloring almost exactly) and then performs randomized rounding [17]
to obtain the path coloring. The resulting path coloring is proved to be within
$1.37 + o(1)$ of the optimal number of colors.

In [3], it is shown that the fractional path coloring problem can be solved in polynomial
time in bounded-degree bidirected trees. By applying a randomized rounding
method similar to that used by Kumar and using the algorithm of Erlebach et
al. [5] as a subroutine, a $(1.613 + o(1))$-approximation algorithm is obtained.

The contribution of this paper can be summarized as follows: 

– We introduce a new randomized rounding method applied to fractional path
$k$-colorings. For the analysis, we study a generalization of a classical
occupancy problem which may be of interest in other applications as well.

– Using the randomized rounding we obtain new existential upper bounds on
the minimum number of colors sufficient to $k$-color any set of paths,
provided that the cost of the optimal fractional coloring is sufficiently large and
the dilation (i.e., the length of the longest path) is small. Existential upper
bounds for arbitrary $k$ are also obtained for arbitrary trees and rings.

– We also discuss two algorithmic applications of the method. For constant
$k$, we present polynomial time approximation path $k$-coloring algorithms in
bidirected trees of bounded degree and in rings. Our algorithms improve
existing ones provided that the load is not small. The same restriction exists
in previous results [3,11]. For WDM networks, this is a realistic assumption.

• We give a method which computes an almost optimal fractional
$k$-coloring of a set of paths on a bounded-degree bidirected tree. For $k = 1$,
this method is slightly weaker than the method in [3] but it is suitable
for our purposes. The fractional $k$-coloring is then used to perform
randomized rounding and, using the algorithms in [5] and [1] as subroutines,
we obtain $(1.511 + o(1))$- and $(1.336 + o(1))$-approximation algorithms
for path $k$-coloring in bounded-degree and binary trees, respectively.

• In rings, we present a reduction of path $k$-coloring to instances of an
integral constrained multicommodity flow problem, generalizing in this
way Tucker's reduction for $k > 1$. This reduction is used for computing
almost optimal fractional $k$-colorings, which, combined with randomized
rounding and existing algorithms [12,13,15,19,20], give better
approximation algorithms for path $k$-coloring ($k \geq 2$) and for special instances
of path coloring.

The strength of our randomized rounding technique is that it uses a
parameter which can be adjusted according to the approximation ratio of the $k$-coloring
algorithm used as a subroutine. It can be used to give path $k$-coloring algorithms
with improved approximation ratio in any graph (directed or undirected) where
the best known upper bound is expressed in terms of the load, provided that an
almost optimal fractional $k$-coloring can be computed efficiently.

The rest of the paper is structured as follows. In Section 2, we present the
occupancy problem and study the behavior of related random variables. We present
the randomized rounding technique in Section 3 together with its analysis and
applications. We devote Section 4 to describing how to compute almost optimal
fractional $k$-colorings in bidirected trees and in rings and how to perform
randomized rounding according to them. Due to lack of space, most of the proofs
have been omitted. They will be included in the final version of the paper.

2 An Occupancy Problem 

In this section, we study the behavior of random variables in a new occupancy 
problem which generalizes classical “balls-to-bins” processes [16]. This will be 
very useful for analyzing the performance of our randomized rounding method. 

Let $k \geq 1$ be an integer, $n > 0$ be an integer multiple of $k$ and $q > 0$. Consider
the following "balls-to-bins" process. We have $n/k$ balls and $n$ bins. Associated
with each ball $i$ and each subset of bins $S_j$ of size $k$ is a non-negative number
$p_{ij}$ such that $\sum_j p_{ij} = 1$ for any ball $i$, and
$\sum_{i=1}^{n/k} \sum_{j : \ell \in S_j} p_{ij} = 1$ for each bin $\ell$.
For each ball $i = 1, \ldots, n/k$, we toss a coin with $\Pr[\text{HEADS}] = q - \lfloor q \rfloor$. On
HEADS, we execute $\lfloor q \rfloor + 1$ rounds, otherwise we execute $\lfloor q \rfloor$ rounds. In each
round executed for ball $i$, a subset of bins of size $k$ is selected randomly among
all possible subsets according to the probabilities $p_{ij}$, and one copy of ball $i$ is
thrown to each bin of the selected set. We denote by $Q$ the random variable
representing the number of empty bins after the execution of the process and by
$\mathcal{R}$ the random variable representing the total number of rounds executed.
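The process can be simulated directly. The sketch below is only an illustration under assumed inputs: the function name, the argument layout, and the example distribution (every ball chooses uniformly among the $n/k$ disjoint blocks of $k$ consecutive bins, which satisfies both normalization conditions) are hypothetical choices, not part of the paper.

```python
import math
import random

def occupancy_process(n, k, q, subsets, probs, rng):
    """Simulate the balls-to-bins process once.

    subsets[i] lists the candidate k-subsets of bins for ball i and
    probs[i] the matching probabilities p_ij.
    Returns (number of empty bins, total number of rounds).
    """
    hit = [False] * n
    total_rounds = 0
    for i in range(n // k):
        # Coin toss: HEADS with probability q - floor(q) adds one round.
        heads = rng.random() < q - math.floor(q)
        rounds = math.floor(q) + (1 if heads else 0)
        for _ in range(rounds):
            chosen = rng.choices(subsets[i], weights=probs[i])[0]
            for b in chosen:  # one copy of ball i per bin of the chosen set
                hit[b] = True
        total_rounds += rounds
    return hit.count(False), total_rounds

# Hypothetical instance: n = 12 bins, k = 3, hence 4 balls.
n, k, q = 12, 3, 1.5
blocks = [tuple(range(s, s + k)) for s in range(0, n, k)]
subsets = [blocks] * (n // k)
probs = [[1 / len(blocks)] * len(blocks)] * (n // k)

empty, rounds = occupancy_process(n, k, q, subsets, probs, random.Random(0))
print(empty, rounds)
```

Averaging the two returned values over many runs gives empirical estimates of $E[Q]$ and $E[\mathcal{R}]$.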

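Lemma 1

a. $E[Q] \leq n e^{-q}$.

b. For any $\lambda > 0$, $\Pr[|Q - E[Q]| > \lambda] < 2\exp\left(-\frac{\lambda^2}{2kn\lceil q \rceil}\right)$.

c. $E[\mathcal{R}] = qn/k$.

d. If $q$ is not an integer, then for any $\lambda > 0$, $\Pr[\mathcal{R} - E[\mathcal{R}] > \lambda] < \exp(\ldots)$.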



3 The Randomized Rounding Technique 

In this section we present the randomized rounding technique. The technique is 
applied to normal sets of paths. A set of (directed) paths P on a network G is 
called normal if it has the same load on every (directed) edge of G. 

The main idea is to round a fractional $k$-coloring of the set of paths $P$ and
obtain a $k$-coloring of some of the paths of $P$. In particular, we use a family of
fractional $k$-coloring functions as a representation of a fractional $k$-coloring.

Definition 2 Let $P$ be a normal set of paths of load $kZ$ (where $Z$ is an integer).
A set of non-negative weight functions $x_j$ for $j = 1, \ldots, Z$ on the $k$-independent
sets of $P$ is called a family of fractional $k$-coloring functions for $P$ if

$$\sum_{I \in \mathcal{I} : p \in I} \sum_{j=1}^{Z} x_j(I) = 1, \text{ for any path } p \in P, \text{ and}$$

$$\sum_{I \in \mathcal{I}} x_j(I) = 1, \text{ for any } j = 1, \ldots, Z,$$

where $\mathcal{I}$ is the set of the $k$-independent sets of $P$.

Observe that if a set of paths $P$ of load $kZ$ (where $Z$ is an integer) on
a graph $G$ has a family of fractional $k$-coloring functions, then it has a
fractional $k$-coloring of cost exactly $Z$, since the weight function $x$ defined as
$x(I) = \sum_{i=1}^{Z} x_i(I)$ for $I \in \mathcal{I}$ is a fractional $k$-coloring of $P$ of cost $Z$. The
converse also holds, as the following lemma states.

Lemma 3 Let $k \geq 1$ be an integer constant and let $P$ be a normal set of paths
of load $kZ$ (where $Z$ is an integer) on a graph $G$. Given a fractional $k$-coloring $x$
of $P$ of cost $Z$, we can construct a family of fractional $k$-coloring functions $y_j$
for $j = 1, \ldots, Z$.

The following lemma implies that, for any set of paths, there exists a superset
which has a family of fractional $k$-coloring functions.

Lemma 4 Let $k > 0$ be an integer and let $P$ be a set of paths on a graph $G$.
Consider the normal set of paths $P'$ of load $k(1 + \lceil f_k(P,G) \rceil)$ on $G$ obtained by
adding single-hop paths to $P$. It holds that $f_k(P',G) = 1 + \lceil f_k(P,G) \rceil$.

We are now ready to describe the randomized rounding technique. The
technique applies to normal sets of paths having a family of fractional $k$-coloring
functions. On input a set of paths $P$ of load $k f_k(P,G)$ (where $f_k(P,G)$ is an
integer) on a graph $G$, the randomized rounding technique uses a parameter $q > 0$
and a family of fractional $k$-coloring functions $x_i$, $i = 1, \ldots, f_k(P,G)$, for $P$ to
properly $k$-color some of the paths of $P$ as follows. Initially, all paths of $P$ are
uncolored. For each $i = 1, \ldots, f_k(P,G)$, randomized rounding proceeds by
tossing a coin with $\Pr[\text{HEADS}] = q - \lfloor q \rfloor$. On HEADS, it executes $\lfloor q \rfloor + 1$ rounds,
otherwise it executes $\lfloor q \rfloor$ rounds. In each round associated with some $i$, a
$k$-independent set is selected by casting a die with a face for each $k$-independent
set $I$ with $x_i(I) > 0$ and probability $x_i(I)$ associated with the face corresponding
to the $k$-independent set $I$. At the end of the round, all the paths of the selected
$k$-independent set which are still uncolored are colored with a new color.
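The rounding step can be sketched as follows. This is a minimal illustration, assuming the family of fractional $k$-coloring functions is given explicitly as a list of dictionaries mapping each $k$-independent set (a frozenset of path names) to its weight. The names `randomized_rounding` and `family` and the four-path instance are hypothetical. One new color is consumed per round, so the number of colors used equals the number of rounds.

```python
import math
import random

def randomized_rounding(family, q, rng):
    """Round a family of fractional k-coloring functions.

    family[i] maps frozenset(paths) -> weight x_i(I); weights sum to 1.
    Returns (color assigned to each colored path, number of colors used).
    """
    color_of = {}
    next_color = 0
    for x_i in family:
        # Coin toss decides between floor(q) and floor(q) + 1 rounds.
        heads = rng.random() < q - math.floor(q)
        rounds = math.floor(q) + (1 if heads else 0)
        sets = list(x_i)
        weights = [x_i[s] for s in sets]
        for _ in range(rounds):
            chosen = rng.choices(sets, weights=weights)[0]  # cast the die
            for p in chosen:
                # Only paths that are still uncolored receive the new color.
                color_of.setdefault(p, next_color)
            next_color += 1
    return color_of, next_color

# Hypothetical instance: four paths over one edge, k = 2, Z = 2.
family = [{frozenset({"a", "b"}): 1.0}, {frozenset({"c", "d"}): 1.0}]
coloring, colors = randomized_rounding(family, 1.0, random.Random(0))
print(coloring, colors)
```

With $q = 1$ each function $x_i$ gets exactly one round, so here both sets are selected once and every path is colored. For smaller $q$ some paths may stay uncolored and must be handled by a separate $k$-coloring algorithm, as in the analysis that follows.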

In the rest of this section we will use the randomized rounding technique
either to prove existential upper bounds on the minimum number of colors
sufficient to $k$-color a set of paths or to obtain polynomial time approximation
algorithms for $k$-coloring sets of paths using a provably small number of colors.

3.1 Existential Upper Bounds

An upper bound of $f_k(P,G)(1 + \ln(km))$ for $w_k(P,G)$ can be obtained by using
the techniques of Lovász [14]. In the following we give better upper bounds for
$w_k(P,G)$ provided that $f_k(P,G)$ is sufficiently large.

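Lemma 5 Let $P$ be a set of paths on a graph $G$ with $m > 3$ edges, $k > 0$ be an
integer and $\beta$ be such that

$$\beta \geq \max_{P' \subseteq P} \left\{ \frac{k\, w_k(P',G)}{L(P',G)} \right\},$$

where $L(P',G)$ denotes the load of the set of paths $P'$ on $G$. If
$f_k(P,G) = \Omega\left(\frac{\beta^2 \ln m}{\ln \beta}\right)$,
then $w_k(P,G) \leq f_k(P,G)\, O(\ln \beta)$, and, if
$f_k(P,G) = \omega\left(\frac{\beta^2 \ln m}{\ln \beta}\right)$,
then $w_k(P,G) \leq f_k(P,G)(1 + \ln \beta + o(1))$.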



Proof. Let $P'$ be the normal set of paths of load $k(1 + \lceil f_k(P,G) \rceil)$ obtained by
adding single-hop paths to $P$. By Lemma 4, $f_k(P',G) = 1 + \lceil f_k(P,G) \rceil$ and
$P'$ has a family of fractional $k$-coloring functions $x_i$ for $i = 1, \ldots, 1 + \lceil f_k(P,G) \rceil$.
We apply randomized rounding to $P'$ with $q = \ln \beta$ using the family of fractional
$k$-coloring functions $x_i$. We define $Z = 1 + \lceil f_k(P,G) \rceil$.

Let $\mathcal{R}$ be the random variable denoting the number of rounds, $e$ be an edge
of $G$ and $Q_e$ be the random variable representing the number of paths traversing
$e$ which are left uncolored after the application of randomized rounding. We may
view the randomized rounding as a balls-to-bins process like the one described
in Section 2. The random variable $\mathcal{R}$ corresponds to the number of rounds in the
balls-to-bins process. The paths traversing edge $e$ are the bins and the paths of
the $k$-independent set traversing $e$ which are selected during a round correspond
to copies of a ball thrown into the $k$ corresponding bins. The probabilities on the
sets of $k$ bins where copies of balls are thrown in the corresponding balls-to-bins
process are defined by the family of fractional $k$-coloring functions. Thus, the
random variable $Q_e$ corresponds to the number of empty bins in the balls-to-bins
process.

By Lemma 1, we obtain that $E[\mathcal{R}] = Z \ln \beta$ and that, for any $\lambda > 0$, the
probability that $\mathcal{R} > E[\mathcal{R}] + \lambda$ is bounded as in Lemma 1d. By setting $\lambda = 2\sqrt{Z \ln m}$,
we have that the probability that the number of colors used during rounding
exceeds $Z \ln \beta + 2\sqrt{Z \ln m}$ is at most $1/m$.

Using Lemma 1, we obtain that $E[Q_e] \leq kZ/\beta$ and that, for any $\lambda > 0$,
the probability that $Q_e > E[Q_e] + \lambda$ is at most
$2\exp\left(-\frac{\lambda^2}{2k^2 Z \lceil \ln \beta \rceil}\right)$.
Setting $\lambda = 2k\sqrt{Z \lceil \ln \beta \rceil \ln m}$, we have that the probability that $Q_e$ exceeds
$kZ/\beta + 2k\sqrt{Z \lceil \ln \beta \rceil \ln m}$ is less than $2/m^2$. Since there are $m$ edges in $G$, the
load of the paths left uncolored after the application of the randomized rounding
technique is at most $kZ/\beta + 2k\sqrt{Z \lceil \ln \beta \rceil \ln m}$, with probability at least $1 - 2/m$.

Now, using the definition of $\beta$, it can be easily verified that, since the set
of paths left uncolored after rounding consists of a subset of the original set of
paths $P$ and (possibly) some additional single-hop paths, it can be $k$-colored
with at most $\beta/k$ times its load colors.

Hence, with probability at least $1 - 3/m > 0$, the total number of colors is
at most

$$Z \ln \beta + 2\sqrt{Z \ln m} + Z + 2\beta\sqrt{Z \lceil \ln \beta \rceil \ln m}.$$

The proof completes by observing that if $f_k(P,G) = \Omega\left(\frac{\beta^2 \ln m}{\ln \beta}\right)$ (resp.
$\omega\left(\frac{\beta^2 \ln m}{\ln \beta}\right)$), then the sum of the second and the fourth term in the above
expression is of order $O(f_k(P,G) \ln \beta)$ (resp. $o(f_k(P,G) \ln \beta)$). □

We will apply Lemma 5 to obtain existential upper bounds for $w_k(P,G)$ in
general (directed or undirected) graphs and in bidirected trees.

Theorem 6 Let $P$ be a set of paths on a graph $G$ with dilation $D$ and $k > 0$ be
an integer. If $f_k(P,G) = \Omega\left(\frac{D^2 \ln m}{\ln D}\right)$, then $w_k(P,G) \leq f_k(P,G)\, O(\ln D)$, and, if
$f_k(P,G) = \omega\left(\frac{D^2 \ln m}{\ln D}\right)$, then $w_k(P,G) \leq f_k(P,G)(1 + \ln D + o(1))$.

Proof. Observe that the conflict graph of any set of paths of dilation $D$ and
load $L$ has degree at most $D(L - 1)$ and, hence, the set of paths can be $k$-colored with at most
$\lceil DL/k \rceil$ colors. The proof completes by applying Lemma 5 with $\beta = D$. □

Theorem 7 Let $k > 0$ be an integer and $P$ be a set of paths of load $\omega(k \ln m)$
on a bidirected tree $T$ with $m$ directed edges. It holds that
$w_k(P,T) \leq (1.511 + o(1)) f_k(P,T)$.

Proof. Erlebach et al. [5] present an algorithm which colors any set of paths of
load $L$ on a bidirected tree with at most $5L/3$ colors. Clearly, it can be slightly
modified to $k$-color any set of paths of load $L$ with at most $\lceil 5L/(3k) \rceil$ colors. Thus,
we may apply Lemma 5 with $\beta = 5/3 + o(1)$ (observe that the lower bound on the
load implies that $f_k(P,T) = \omega(\ln m)$) and obtain the desired bound. □



3.2 Algorithmic Applications 

Observe that the path $k$-coloring algorithm we used in the proof of Lemma 5
would run in polynomial time on input a set of paths $P$ on a graph $G$ if (1) a
normal superset $P'$ of $P$ of load $k\lceil f_k(P,G) \rceil$ on $G$ can be computed in polynomial
time, (2) die-casting according to a family of fractional $k$-coloring functions
implied by the fractional $k$-coloring of $P'$ can be performed in polynomial time,
and (3) for any set of paths $P$, a $k$-coloring of the paths in $P$ with at most $\beta/k$
times the load of $P$ colors can be computed in polynomial time. Although in both
Theorems 6 and 7 property (3) is guaranteed by a polynomial time algorithm,
(1) and (2) are infeasible in general unless P = NP. This is due to the fact that
fractional path coloring is as hard to approximate as fractional graph coloring
(it is easy to see that for any graph $H$, we can construct a set of paths on a
graph $G$ having $H$ as its conflict graph) which, in turn, is almost as hard to
approximate as graph coloring [8,14]. Moreover, a family of fractional $k$-coloring
functions $x_i$ may have $x_i(I) > 0$ for exponentially many $k$-independent sets $I$ of
$P$.

In Section 4, given a set of paths $P$ on a graph $G$ which is either a bidirected
tree of bounded degree or a ring, we show how to construct a normal superset
of $P$ of load $kZ$ having a fractional $k$-coloring of integer cost $Z \leq 1 + \lceil f_k(P,G) \rceil$
and how to perform die-casting according to a family of fractional $k$-coloring
functions implied by this fractional $k$-coloring, both in polynomial time when
$k > 0$ is an integer constant.

For bidirected trees of bounded degree, following this approach, applying
randomized rounding with $q = \ln\frac{5}{3} \approx 0.511$, and using the algorithm of Erlebach
et al. [5] to $k$-color the paths left uncolored after the application of randomized
rounding, we obtain the following result.

Theorem 8 Let $k \geq 1$ be an integer constant. There exists a polynomial-time
algorithm which, on input a set of paths $P$ of load $\omega(\ln m)$ on a bounded-degree
bidirected tree with $m$ directed edges, computes a $(1.511 + o(1))$-approximate
$k$-coloring of $P$, with high probability.

For sets of paths of load $L$ on binary trees of depth $O(L^{1/3-\epsilon})$, there exists a
randomized algorithm that colors them using at most $7L/5 + o(L)$ colors, with
high probability [1]. Thus, we may follow the same approach used for
bounded-degree trees, apply randomized rounding with $q = \ln\frac{7}{5} \approx 0.336$, and use this
randomized algorithm to $k$-color the paths left uncolored to obtain the following
result.

Theorem 9 Let $k \geq 1$ be an integer constant. There exists a polynomial-time
algorithm which, on input a set of paths $P$ of load $\omega(\ln m)$ on a binary bidirected
tree with $m$ directed edges and of depth $O(L^{1/3-\epsilon})$, computes a $(1.336 + o(1))$-approximate
$k$-coloring of $P$, with high probability.



We now present an improved approximation for some instances of the path
coloring problem in rings. On input a set of paths $P$ on a ring, we use randomized
rounding with $q = \ln\frac{l-1}{l-2}$, where $l$ is the minimum number of paths of $P$ necessary
to cover the ring, and Tucker's algorithm [19] to color the paths left uncolored
after randomized rounding. Li and Simha [13] and, independently,
Valencia-Pabon [20] show that Tucker's algorithm colors $P$ with at most
$\lceil \frac{l-1}{l-2} L \rceil + 1$ colors. We obtain the following result.

Theorem 10 There exists a polynomial-time algorithm which, on input a set of
paths $P$ of load $\omega(\ln m)$ on a ring with $m$ edges, computes a
$\left(1 + \ln\frac{l-1}{l-2} + o(1)\right)$-approximate coloring of $P$, with high probability, where $l$ is the minimum number
of paths in $P$ necessary to cover the ring.

For sets of paths with $l \geq 5$, the approximation ratio of our algorithm is
better than the approximation ratio of the algorithms in [11], [20], and [13].

We can also improve the best known approximation ratio for $k$-coloring of
sets of paths in rings by using randomized rounding with $q = \ln(1 + 1/k)$ and
an algorithm presented in [15,12] to complete the $k$-coloring. This algorithm
$k$-colors any set of paths of load $L$ on a ring using at most $(1 + \frac{1}{k})\frac{L}{k} + c_k$ colors
(where $c_k$ may depend on $k$). We obtain the following result.

Theorem 11 Let $k \geq 2$ be an integer constant. There exists a polynomial-time
algorithm which, on input a set of paths $P$ of load $\omega(\ln m)$ on a ring with $m$
edges, computes a $(1 + \ln(1 + 1/k) + o(1))$-approximate $k$-coloring of $P$, with
high probability.
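The constants in Theorems 8 to 11 all follow the same pattern: if the subroutine $k$-colors a set of paths with about $\beta/k$ times its load colors, rounding with $q = \ln \beta$ gives a ratio of $1 + \ln \beta$ up to lower-order terms. The short check below evaluates this expression. The helper name `ratio` is hypothetical, and the ring entries assume that the bound for Tucker's algorithm has the form $\frac{l-1}{l-2}L$ plus a constant.

```python
import math

def ratio(beta):
    """Leading term 1 + ln(beta) of the ratio obtained with q = ln(beta)."""
    return 1 + math.log(beta)

print("bounded-degree trees:", round(ratio(5 / 3), 3))  # beta = 5/3
print("binary trees:        ", round(ratio(7 / 5), 3))  # beta = 7/5

# Rings, path coloring: assumed beta = (l - 1) / (l - 2) for cover number l.
for l in (4, 5, 6):
    print("ring, l =", l, round(ratio((l - 1) / (l - 2)), 3))

# Rings, path k-coloring: beta = 1 + 1/k.
for k in (2, 3, 4):
    print("ring, k =", k, round(ratio(1 + 1 / k), 3))
```

Under that assumption the ring ratio drops below the $1.37 + o(1)$ bound of [11] from $l = 5$ onwards.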

4 Computing Families of Fractional k-Coloring Functions

In this section, given a set of paths $P$ on a bounded-degree tree or on a ring, we
show how to compute normal supersets of $P$ of load $kZ$ which have a fractional
$k$-coloring of integer cost $Z \leq 1 + \lceil f_k(P,G) \rceil$.

In both cases, we follow the same augmentation procedure. Starting with a
set of paths $P$ of load $L$ on a network $G$, we construct a normal superset $P_0$ of
$P$ having load the first multiple of $k$ greater than or equal to $L$ (i.e., $k\lceil L/k \rceil$). This
is done by adding single-hop paths traversing the edges of the network which are
not fully loaded. We run a procedure called checker on the set of paths $P_0$. The
checker returns YES if the set of paths taken as input has a fractional $k$-coloring
of cost equal to its load over $k$; it returns NO otherwise. If the checker returns
NO, we continue this procedure for $i = 1, 2, \ldots$, by constructing a normal superset
$P_i$ of $P$ of load $k(i + \lceil L/k \rceil)$ and running the checker on $P_i$, until it returns YES.
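The control flow of the augmentation procedure is a simple loop around the checker. The sketch below shows only that loop. The callables `checker` and `make_normal_superset` are hypothetical placeholders for the checker and for the padding with single-hop paths, whose actual implementations depend on the network.

```python
import math

def augment(load, k, checker, make_normal_superset):
    """Grow the target load in steps of k until the checker answers YES.

    load: load L of the original set of paths.
    checker(S): True iff S has a fractional k-coloring of cost load(S)/k.
    make_normal_superset(target): normal superset of P with the given load.
    Returns (accepted superset, Z) where the accepted load equals k * Z.
    """
    base = math.ceil(load / k)
    i = 0
    while True:
        target = k * (base + i)          # load of P_i
        superset = make_normal_superset(target)
        if checker(superset):
            return superset, base + i
        i += 1

# Toy stand-ins: a "superset" is represented by its load only, and the
# checker accepts once the load reaches 12.
result, Z = augment(7, 3, lambda s: s >= 12, lambda target: target)
print(result, Z)
```

With $L = 7$ and $k = 3$ the loop tries loads 9 and 12, so the toy run stops at $Z = 4$.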

By Lemma 4, we know that the augmentation procedure terminates after at
most $2 + \lceil f_k(P,G) \rceil - \lceil L/k \rceil$ executions of the checker. Clearly, $\lceil f_k(P,G) \rceil$ is
polynomial in $L$ and the size of the graph. Furthermore, the load of the set of
paths given as input to the checker in each execution is also polynomial in $L$ and
the size of the graph. In what follows, we will describe how the checker works
in bounded-degree bidirected trees and in rings and we will claim that it runs
in polynomial time in terms of the load of the set of paths taken as input and
the size of the graph. As a result, we will obtain that the whole augmentation
procedure runs in polynomial time. In both cases, we can also show how to use
the fractional $k$-coloring computed during the last execution of the checker to
perform die-casting in polynomial time according to a family of fractional
$k$-coloring functions implied by this fractional $k$-coloring. Due to lack of space,
formal proofs have been omitted. They will be included in the final version of
the paper.



4.1 Bidirected Trees 

In this section, we describe the checker TREE-k-CHECKER for checking
whether a normal set of paths $P$ of load $L$, which is a multiple of $k$, on a bidirected
tree $T$ has a fractional $k$-coloring of cost $L/k$.

Given a non-leaf node $v$ of the tree, consider the subset $P_v$ of $P$ containing
the paths that touch node $v$. We denote by $\mathcal{I}(P_v)$ the set of all $k$-independent
sets of paths of $P_v$ which have full load $k$ on each directed edge adjacent to $v$.
TREE-k-CHECKER constructs the linear program described in the following:

The linear program has a non-negative weight $x(I)$ for each
$k$-independent set $I$ of $\mathcal{I}(P_v)$, for any non-leaf node $v$ of the tree. The
objective is to maximize the sum of the weights of the $k$-independent sets
of $\mathcal{I}(P_r)$, where $r$ is a specific non-leaf node of $T$. There are constraints
of two types. The first type of constraints is that, for each path $p \in P$
and for each non-leaf node $v$ it touches, the sum of the weights of the
$k$-independent sets of $\mathcal{I}(P_v)$ it belongs to is constrained to be at most 1.
The second type of constraints is that, for any pair of adjacent non-leaf
nodes $v$ and $u$ of the tree, any set of $k$ paths $p_1, \ldots, p_k$ traversing the
directed edge $(v,u)$ and any set of $k$ paths $q_1, \ldots, q_k$ traversing the
opposite directed edge $(u,v)$, the sum of the weights of the $k$-independent
sets of $\mathcal{I}(P_u)$ that contain $p_1, \ldots, p_k, q_1, \ldots, q_k$ is constrained to be equal
to the sum of the weights of the $k$-independent sets of $\mathcal{I}(P_v)$ that contain
$p_1, \ldots, p_k, q_1, \ldots, q_k$.

TREE-k-CHECKER solves the above linear program and returns YES if it
has a solution of cost $L/k$. Otherwise, it returns NO.

Lemma 12 Let $k > 0$ be an integer constant. On input a normal set of paths $P$
of load $L$ which is a multiple of $k$ on a bidirected tree $T$ of bounded degree,
TREE-k-CHECKER runs in polynomial time and returns YES iff $P$ has a fractional
$k$-coloring of cost $L/k$.

Now consider the application of the augmentation procedure on the
original set of paths $P$ of load $L$ on the tree $T$ using TREE-k-CHECKER as
checker. We denote by $P_{Z - \lceil L/k \rceil}$ the normal set of paths of load $kZ$ (where
$Z$ is an integer) produced when the augmentation procedure terminates. By
the definition of the augmentation procedure and by Lemma 12, it is clear that
$Z = \lceil f_k(P_{Z - \lceil L/k \rceil}, T) \rceil$ which, by Lemma 4, is at most $1 + \lceil f_k(P,T) \rceil$.

When the augmentation procedure terminates we use the solution of the
linear program to implicitly build a family of fractional $k$-coloring functions
and perform die-casting according to them. We can show that this can be done in
polynomial time.



4.2 Rings 

In this section, we describe the checker RING-k-CHECKER. It receives as input
a normal set of paths $P$ of load $L$, which is a multiple of $k$, on a ring $C$ with $m$
edges and checks whether $P$ has a fractional $k$-coloring of cost $L/k$.

We denote by $e_0, e_1, \ldots, e_{m-1}$ the edges of the ring $C$ (edges $e_i$ and $e_{(i+1) \bmod m}$
are consecutive), by $P_{e_i}$ the subset of $P$ consisting of the paths of $P$ traversing
edge $e_i$, and by $\mathcal{I}(P_{e_i})$ the set of all subsets of $P_{e_i}$ of size $k$. Note that each
set of paths in $\mathcal{I}(P_{e_i})$ is a $k$-independent set. RING-k-CHECKER considers the
following multicommodity flow network $H(P,C)$.

The network has $m + 1$ levels of nodes. Levels $0, \ldots, m - 1$ correspond
to the edges $e_0, e_1, \ldots, e_{m-1}$ of the ring $C$ while level $m$ corresponds to
edge $e_0$ of $C$ as well. In each of these levels corresponding to the edge $e_i$,
the network has $N = \binom{L}{k}$ nodes; one node for each $k$-independent set of
$\mathcal{I}(P_{e_i})$. For each node $u$ of level $i < m$ corresponding to a $k$-independent
set $I$, we define the forward set of $u$ to be the set of paths of $I$ which
traverse edge $e_{(i+1) \bmod m}$. For each node $u$ of level $i > 0$ corresponding to
a $k$-independent set $I$, we define the backward set of $u$ to be the set of
paths of $I$ which traverse edge $e_{i-1}$. The network $H(P,C)$ has a directed
edge from a node $u$ of level $i$ to a node $v$ of level $i + 1$ iff the forward
set of $u$ is the same as the backward set of $v$. The network $H(P,C)$
has $N$ commodities. Each node of level 0 is the source of a commodity.
The sink for each commodity is located at the node of level $m$ which
corresponds to the same $k$-independent set of $\mathcal{I}(P_{e_0})$ as its source.
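The level structure of $H(P,C)$ can be built mechanically from the forward and backward sets. The sketch below does so for a tiny ring. The representation (a path as the set of ring edge indices it traverses), the function name `build_network`, and the four-path instance are illustrative assumptions, and capacities and commodities are left out.

```python
from itertools import combinations

def build_network(paths, m, k):
    """Nodes per level and arcs of H(P, C) for a ring with m edges.

    paths: dict mapping a path name to the set of ring edges it traverses.
    Returns (levels, arcs), where levels[i] lists the k-subsets of the paths
    on the edge of level i, and arcs holds triples (i, u, v) for an arc from
    node u of level i to node v of level i + 1.
    """
    levels = []
    for lvl in range(m + 1):
        e = lvl % m  # level m corresponds to edge e_0 again
        on_edge = sorted(p for p, edges in paths.items() if e in edges)
        levels.append([frozenset(c) for c in combinations(on_edge, k)])
    arcs = []
    for i in range(m):
        next_edge = (i + 1) % m
        for u in levels[i]:
            forward = frozenset(p for p in u if next_edge in paths[p])
            for v in levels[i + 1]:
                # Backward set of v: its paths that traverse edge e_i.
                backward = frozenset(p for p in v if i in paths[p])
                if forward == backward:
                    arcs.append((i, u, v))
    return levels, arcs

# Hypothetical normal instance of load 2 on a ring with 3 edges, k = 1.
paths = {"a": {0, 1}, "b": {2}, "c": {0}, "d": {1, 2}}
levels, arcs = build_network(paths, 3, 1)
print([len(level) for level in levels], len(arcs))
```

In this instance a source-to-sink route such as {a}, {a}, {b}, {a} collects the paths a and b, which together form a 1-independent set, i.e. one color class.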

RING-k-CHECKER solves the maximum multicommodity flow problem on
the network $H(P,C)$ under the constraint that for each path $p$ of $P$, and for each
edge $e_i$ traversed by $p$, the total flow entering (leaving) all the nodes of $H(P,C)$
of level $i$ corresponding to $k$-independent sets of $\mathcal{I}(P_{e_i})$ that contain the path
$p$ is at most 1. RING-k-CHECKER returns YES if there is a total flow of size
$L/k$. Otherwise, it returns NO.

Lemma 13 Let $k > 0$ be an integer constant. On input a normal set of paths
$P$ of load $L$ which is a multiple of $k$ on a ring $C$, RING-k-CHECKER runs in
polynomial time and returns YES iff $P$ has a fractional $k$-coloring of cost $L/k$.

Now consider the application of the augmentation procedure on the
original set of paths $P$ of load $L$ on the ring $C$ using RING-k-CHECKER as
checker. We denote by $P_{Z - \lceil L/k \rceil}$ the normal set of paths of load $kZ$ (where
$Z$ is an integer) produced when the augmentation procedure terminates. By
the definition of the augmentation procedure and by Lemma 13, it is clear that
$Z = \lceil f_k(P_{Z - \lceil L/k \rceil}, C) \rceil$ which, by Lemma 4, is at most $1 + \lceil f_k(P,C) \rceil$.

When the augmentation procedure terminates, we use the solution to the
multicommodity flow problem on the network $H(P_{Z - \lceil L/k \rceil}, C)$ to obtain a
fractional $k$-coloring $x$. This is done by decomposing the flow for each commodity
on $H(P_{Z - \lceil L/k \rceil}, C)$ into flow paths, mapping the flow paths into $k$-independent
sets of $P_{Z - \lceil L/k \rceil}$, and assigning to each of these $k$-independent sets $I$ weight $x(I)$
equal to the flow carried by the corresponding flow path. Using $x$, we can obtain a
family of fractional $k$-coloring functions $y_j$ for $P_{Z - \lceil L/k \rceil}$ and perform die-casting
according to them. Again, we can show that this can be done in polynomial time.

Abstract. The combinatorial core of the OVSF code assignment problem
that arises in UMTS is to assign some nodes of a complete binary
tree of height $h$ (the code tree) to $n$ simultaneous connections, such that
no two assigned nodes (codes) are on the same root-to-leaf path. Each
connection requires a code on a specified level. The code can change over
time as long as it is still on the same level. We consider the one-step code
assignment problem: Given an assignment, move the minimum number of
codes to serve a new request. Minn and Siu proposed the so-called
DCA-algorithm to solve the problem optimally. We show that DCA does not
always return an optimal solution, and that the problem is NP-hard.
We give an exact $n^{O(h)}$-time algorithm, and a polynomial time greedy
algorithm that achieves approximation ratio $\Theta(h)$. Finally, we consider
the online code assignment problem for which we derive several results.



1 Introduction 

Recently UMTS has received a lot of attention, and also raised new algorithmic
problems. In this paper we focus on a specific aspect of its air interface
W-CDMA that turns out to be algorithmically interesting, more precisely on its
multiple access method DS-CDMA. The purpose of this access method is to
enable all users in one cell to share the common resource, i.e. the bandwidth. In
DS-CDMA this is accomplished by a spreading and scrambling operation. Here
we are interested in the spreading operation that spreads the signal and separates
the transmissions from the base-station to the different users. More precisely, we
consider spreading by Orthogonal Variable Spreading Factor (OVSF-) codes [1,
14]. These codes are derived from a code tree. The OVSF-code tree is a complete
binary tree of height $h$ that is constructed in the following way: The root is
labeled with the vector $(1)$, the left child of a node labeled $a$ is labeled with $(a, a)$,
and the right child with $(a, -a)$. Each user in one cell is assigned a different
OVSF-code. The key property that separates the signals sent to the users is
the mutual orthogonality of the users' codes. All assigned codes are mutually
orthogonal if and only if there is at most one assigned code on each leaf-to-root
path. In DS-CDMA users request different data rates and get OVSF-codes of
different levels. (The data rate is inversely proportional to the length of the
code.) In particular, it is irrelevant which code on a level a user gets, as long
as all assigned codes are mutually orthogonal. We say that an assigned code in
a node in the tree blocks all codes in the subtree below it and all codes on the
path to the root, see Fig. 1 for an illustration.
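The labeling rule and the orthogonality property are easy to check on small examples. The sketch below is an illustration only: the encoding of a node by its left/right choices from the root and the helper names are assumptions, and orthogonality between codes of different lengths is interpreted here as every aligned block of the longer code having zero inner product with the shorter one.

```python
def ovsf_code(bits):
    """Label of the node reached from the root by bits (0 = left, 1 = right)."""
    code = [1]
    for b in bits:
        second_half = code if b == 0 else [-x for x in code]
        code = code + second_half  # left child: (a, a); right child: (a, -a)
    return code

def orthogonal(a, b):
    """True if every aligned block of the longer code is orthogonal to the shorter."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    n = len(short)
    return all(
        sum(x * y for x, y in zip(short, long_[i:i + n])) == 0
        for i in range(0, len(long_), n)
    )

left, right = ovsf_code([0]), ovsf_code([1])
below_left = ovsf_code([0, 1])           # a descendant of `left`
print(left, right, below_left)
print(orthogonal(left, right))           # different root-to-leaf paths
print(orthogonal(left, below_left))      # same root-to-leaf path
print(orthogonal(right, below_left))     # different root-to-leaf paths
```

The pair on a common root-to-leaf path is the only one that fails the check, which matches the blocking rule stated above.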




Fig. 1. A code assignment and blocked codes

Fig. 2. A code insertion on level 2 into a single code tree T



As users connect to and disconnect from a given base station, i.e. request 
and release codes, the code tree can get fragmented. Then it can happen that a 
code request for a higher level cannot be served at all, because lower level codes 
block all codes on this level. For example in Fig. 1 no code can be inserted on 
level 2 without reassigning another code, even though there is enough available 
bandwidth. This problem is known as code blocking or code-tree fragmentation 
[17,18]. One way of solving this problem is to reassign some codes in the tree 
(more precisely, to assign alternative OVSF-codes of the same level to some users 
in the cell). In Fig. 2 some user requests a code on level two, where all codes 
are blocked. Still, after reassigning some of the assigned codes as indicated by 
the dashed arrows, the request can be served. Here and in many of the following 
figures we only depict the relevant parts (subtrees) of the single code tree. 
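Code blocking can be reproduced with a few lines of code. The sketch below assumes levels are counted from the leaves upwards starting at 0 and identifies a node by a pair (level, index), so that it covers a contiguous range of leaves. This encoding, the helper names, and the example assignment are illustrative choices rather than the paper's notation.

```python
def leaf_range(node):
    """Half-open range of leaf indices below a node given as (level, index)."""
    level, index = node
    return index << level, (index + 1) << level

def is_blocked(node, assigned):
    """A node is blocked if it shares a root-to-leaf path with an assigned code."""
    lo, hi = leaf_range(node)
    for a in assigned:
        a_lo, a_hi = leaf_range(a)
        if a_lo < hi and lo < a_hi:  # leaf ranges overlap, hence nested nodes
            return True
    return False

def free_codes(level, h, assigned):
    """All unblocked nodes on the given level of a code tree of height h."""
    return [(level, i) for i in range(2 ** (h - level))
            if not is_blocked((level, i), assigned)]

h = 3
fragmented = {(0, 0), (0, 2), (0, 4), (0, 6)}   # half of the bandwidth in use
print(free_codes(2, h, fragmented))             # no free code on level 2

# Reassign two of the leaf codes to other leaves on the same level.
compacted = {(0, 0), (0, 1), (0, 2), (0, 3)}
print(free_codes(2, h, compacted))
```

Both assignments use the same bandwidth, but only the second one leaves a level-2 code free, so two reassignments suffice to serve the request in this toy instance.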

The process of reassigning codes necessarily induces signaling overhead from 
the base station to the users whose codes change. This overhead should be kept 
small. Therefore, a natural objective already stated in [18,19] is to serve all code 
requests as long as this is possible, while keeping the number of reassignments as 
small as possible. (In fact, as long as the bandwidth of all simultaneously active 
code requests does not exceed the total available bandwidth, it is always possible 
to serve them.) The problem has been studied before with focus on simulations. 
In [18] the problem of reassigning the codes for a single additional request is 
introduced. The Dynamic Code Assignment (DCA) algorithm is presented and 
claimed to be optimal. In this paper we prove that this algorithm is not always
optimal and analyze natural versions of the underlying code assignment (CA)
problem. Our intention is to present a first rigorous analysis of this problem.

First, we give a counterexample to the optimality of the DCA-algorithm in
Sect. 2, then prove the original problem stated by Minn and Siu to be NP-hard
for a natural input encoding in Sect. 3. In Sect. 4 we give a dynamic
programming algorithm that solves the problem with running time $n^{O(h)}$,
where $n$ is the number of assigned codes in the tree. In Sect. 5 we give a
sketch of an involved analysis showing that a natural greedy algorithm already
mentioned in [18] achieves approximation ratio $h$ for one step. Finally, we
tackle the online-problem in Sect. 6, which is a more natural version of the
problem. We present a $\Theta(h)$-competitive algorithm and show that the greedy
strategy that minimizes the number of reassignments in every step is not better
than $\Omega(h)$-competitive. We also give an online-algorithm with constant
competitive ratio that uses resource augmentation, i.e. we give it one more
level than the adversary. Details omitted in this paper can be found in [9].

1.1 Problem Definition 

We consider the combinatorial problem of assigning codes to users. The codes 
are the nodes of an (OVSF-) code tree T = (V, E). Here T is a complete binary 
tree of height h. The set of all users using a code at a given moment in time can 
be modelled by a request vector $r = (r_0, \ldots, r_h) \in \mathbb{N}^{h+1}$, where $r_i$ is the
number of users requesting a code on level $i$ (with bandwidth $2^i$). The levels of the
tree are counted from the leaves to the root starting at level 0. We denote by $l(v)$ the
level of node $v$. Each request is assigned to a node in the tree, such that for all levels
$i \in \{0, \ldots, h\}$ there are exactly $r_i$ codes on level $i$, and on every path $p_j$ from a leaf
$j$ to the root there is at most one code assigned. We call every set of positions
$F \subseteq V$ in the tree $T$ that fulfills these properties a code assignment. For ease of
presentation we also call F the set of codes. Throughout this paper, a code tree 
is the tree together with a code assignment F. If a user connects to the base 
station, the resulting additional request for a code represents a code insertion 
(on a given level). If some user disconnects, this represents a deletion (in a given 
position). A new request is dropped if it cannot be served. This is the case, if its 
acceptance would exceed the total bandwidth. By N we denote the number of 
leaves of $T$ and by $n$ the number of assigned codes $|F|$. After an insertion on level
$l_t$ at time $t$, any CA-algorithm must change the code assignment $F_t$ for request
vector $r$ into $F_{t+1}$ for the new request vector $r' = (r_0, \ldots, r_{l_t} + 1, \ldots, r_h)$. The
size $|F_{t+1} \setminus F_t|$ corresponds to the number of reassignments. Therefore, for an
insertion, the new assignment is counted as a reassignment. We define the cost 
function as the number of reassignments. Deletions are not considered in the cost 
function. When we want to emphasize the combinatorial side of the problem we 
call a reassignment a movement of a code. We state the original CA problem 
studied by Minn and Siu together with some of its natural variants: 

One-step offline CA. Given a code assignment $F$ for a request vector $r$ and
a code request for level $l$. Find a code assignment $F'$ for the new request
vector $r' = (r_0, \ldots, r_l + 1, \ldots, r_h)$ with minimum number of reassignments.

General offline CA. Given a sequence $S$ of code insertions and deletions of
length $m$. Find a sequence of code assignments so that the total
number of reassignments is minimum, assuming the initial code tree is empty.

Online CA. This is the same problem as the general offline CA, except that 
the future requests of S are not known in advance. 
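
The definitions of this subsection can be made concrete with a small sketch. It is only an illustration under assumptions that are not part of the paper: a code is stored as a pair (level, index), where index counts the nodes of that level from the left, and the function names are hypothetical. The first function checks the path condition of a code assignment, the second evaluates the cost $|F_{t+1} \setminus F_t|$ of one step.

```python
from typing import Iterable, Set, Tuple

# Assumed representation: (level, index), index counted from the left on that level.
Code = Tuple[int, int]


def is_code_assignment(h: int, codes: Iterable[Code]) -> bool:
    """Return True if no leaf-to-root path of a height-h tree carries two codes."""
    covered = set()  # leaves that already lie below some code
    for level, index in codes:
        if not (0 <= level <= h and 0 <= index < 2 ** (h - level)):
            return False
        leaves = range(index << level, (index + 1) << level)
        if any(leaf in covered for leaf in leaves):
            return False  # two codes share a leaf-to-root path
        covered.update(leaves)
    return True


def reassignments(old: Set[Code], new: Set[Code]) -> int:
    """Cost of one step: the number of positions in the new assignment that are not in the old one."""
    return len(new - old)
```

For a tree of height 2 with leaf codes at indices 0 and 2, serving a request on level 1 by moving the second leaf code next to the first costs two reassignments: `reassignments({(0, 0), (0, 2)}, {(0, 0), (0, 1), (1, 1)})` evaluates to 2, since the inserted code itself counts as a reassignment.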



1.2 Related Work 

It was a paper by Minn and Siu [18] that originally drew our attention to this 
problem. There the one-step offline version is defined together with an algorithm 
that is claimed to solve it optimally. As we show in Sect. 2 this claim is not 
correct. Many of the follow-up papers like [3,5,6,11,12,16,19] acknowledge the 
original problem to be solved by Minn and Siu and study some other aspects of 
it. Assarut et al. [3] evaluate the performance of Minn and Siu’s DCA-algorithm, 
and compare it to other schemes. Moreover, a different algorithm is proposed for 
a more restricted setting in [2]. Other authors use additional mechanisms like 
time multiplexing or code sharing on top of the original problem setting in order 
to mitigate the code blocking problem [5,19]. Dell’Amico et al. [8] present a tree 
partitioning policy resembling the compact representation algorithm of Sect. 6. 
A different direction is to use a heuristic approach that tackles the problem for 
small input instances [5]. Kam, Minn and Siu [16] address the problem in the 
context of bursty traffic and different QoS (Quality of Service). They present a notion of “fairness”
and also propose to use multiplexing. Priority based schemes for different QoS 
classes can be found in [7], similar in perspective are [11,12]. 

Fantacci and Nannicini [10] are among the first to express the problem in 
its online version, although they have quite a different focus. They present a 
scheme that is similar to the compact-representation scheme in Sect. 6, without 
focusing on the number of reassignments. Rouskas and Skoutas [19] propose a 
greedy online-algorithm that minimizes in each step the number of additionally 
blocked codes, and provide simulation results but no analysis. 



2 Non-optimality of Greedy Algorithms 

Here we look at possible greedy algorithms for the one-step offline CA. A straight- 
forward greedy approach is to select for a code insertion a subtree with minimum 
cost (that is not blocked by a code above the requested level) , according to some 
cost function. All codes in the selected subtree must then be reassigned. So in 
every step a top-down greedy algorithm chooses the maximum bandwidth code 
that has to be reassigned, places it at the root of a minimum cost subtree, takes 
out the codes in that subtree and proceeds recursively. The DCA-algorithm in 
[18] works in this way. The authors propose different cost functions, among which 
the “topology search” cost function is claimed to solve the one-step offline CA 
optimally. Here we show the following theorem: 

Theorem 1. Any top-down greedy algorithm $A_{tdg}$ whose cost function depends
only on the current assignment of the considered subtree is not optimal. 

As all proposed cost functions in [18] depend only on the current assignment 
of the considered subtree, this theorem implies the non-optimality of the DCA- 
algorithm. 

Proof. Our construction considers the subtrees in Fig. 3 and the assignment of
a new code to the root of the tree $T_0$. The tree $T_0$ has a code with bandwidth $2k$
on level $l$ and, depending on the cost function, has or does not have a code with
bandwidth $k$ on level $l-1$. The subtree $T_1$ contains $k-1$ consecutive codes at
leaf level and the rest of the subtree is empty. The subtrees $T_2$ and $T_3$ contain $k$
codes at leaf level interleaved with $k$ free leaves. All other subtrees, in particular,
the sibling trees of $T_1$, $T_2$ and $T_3$ (omitted from the figure) have all the leaves
assigned. This pairing rules out all cost functions that do not put the initial code
at the root of $T_0$. We are left with two cases:

Fig. 3. Example for the proof of Thm. 1

Case 1: The cost function evaluates $T_2$ and $T_3$ as cheaper than $T_1$. In this case
we let the subtree $T_0$ contain only the code with bandwidth $2k$. Algorithm
$A_{tdg}$ reassigns the code with bandwidth $2k$ to the root of the subtree $T_2$ or
$T_3$, which causes one more reassignment than assigning it to the root of $T_1$,
hence the algorithm fails to produce the optimal solution.

Case 2: The cost function evaluates $T_1$ as cheaper than $T_2$ and $T_3$. In this case
we let the subtree $T_0$ have both codes. $A_{tdg}$ moves the code with bandwidth
$2k$ to the root of $T_1$ and the code with bandwidth $k$ into the tree $T_2$ or
$T_3$, see solid lines in Fig. 3. The number of reassigned codes is $3k/2 + 2$.
But the minimum number of reassignments is $k + 3$, achieved when the code
with bandwidth $k$ is moved into the empty part of $T_1$ and the code with
bandwidth $2k$ is moved to the root of $T_2$ or $T_3$, see dashed lines in Fig. 3.

□ 



3 NP-Hardness of One-Step Offline CA

We prove the decision variant of the one-step offline CA to be NP-complete. It
asks whether a new insertion can be handled with cost less than or equal to
$c_{max}$, which is part of the input. Trivially, the decision variant is in NP.
NP-completeness is established by a reduction from the three-dimensional
matching problem.

Problem 1 (3DM). Given a set $M \subseteq W \times X \times Y$, where $W$, $X$ and $Y$ are disjoint
sets having the same number $q$ of elements. Does $M$ contain a matching, i.e.,
a subset $M' \subseteq M$ such that $|M'| = q$ and no two elements of $M'$ agree in any
coordinate? [13]

Let us index the elements of the ground sets $W$, $X$, $Y$ from 1 to $q$. We introduce
the indicator vector of a triplet $(w_i, x_j, y_k)$ as a zero-one vector of length $3q$ that
is all zero except at the indices $i$, $q + j$ and $2q + k$. The idea of the reduction is to
see the triplets as such indicator vectors and to observe that the problem 3DM
is equivalent to finding a subset of $q$ indicator vectors from $M$ that sum up to
the all-one vector.
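
The reformulation of 3DM in terms of indicator vectors can be illustrated as follows. This is a minimal sketch; the function names are hypothetical and triplets are assumed to be given as index triples (i, j, k) with 1-based indices, as in the text.

```python
def indicator(triplet, q):
    """Zero-one vector of length 3q that is 1 exactly at indices i, q + j, 2q + k (1-based)."""
    i, j, k = triplet
    vec = [0] * (3 * q)
    for pos in (i, q + j, 2 * q + k):
        vec[pos - 1] = 1
    return vec


def is_matching(triplets, q):
    """True if the q chosen triplets have indicator vectors summing to the all-one vector."""
    if len(triplets) != q:
        return False
    column_sums = [sum(col) for col in zip(*(indicator(t, q) for t in triplets))]
    return all(total == 1 for total in column_sums)
```

With q = 2, the triplets (1, 1, 2) and (2, 2, 1) form a matching, whereas (1, 1, 2) and (2, 1, 1) do not, because they agree in the second coordinate and the corresponding entry of the sum is 2.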




Fig. 4. Sketch of the construction 



Figure 4 shows an outline of the construction that we use for the reduction. 
An input to 3DM is transformed into an initial feasible assignment that consists 
of a token tree on the left side and different smaller trees on the right. The 
construction is set up in such a way that the code insertion forces the q codes in 
the token tree to move to the right side. Then these codes must be assigned to 
the roots of some triplet trees. The choice of the q triplet trees reflects the choice 
of the corresponding triplets of a matching. All codes in the chosen triplet trees
find a place without any additional reassignment if and only if these triplets
represent a 3D matching.

The token tree consists of $q$ codes positioned arbitrarily on level $l_{start}$ with
sufficient depth. The triplet trees have their roots on the same level $l_{start}$. They
are constructed from the indicator vectors of the triplets. For each of the $3q$
positions of the vector such a tree has four levels together called a layer that
encode either zero or one, where the encodings of zero and one are shown in Fig.
5(a) and (b). Figure 5(c) shows how layers are stacked using sibling trees.

The receiver trees are supposed to receive all codes moved out of triplet trees. 
We construct them such that they can absorb from every layer exactly one one- 
tree and q — 1 zero trees and the sibling trees. The codes fit exactly in the free 
positions, iff the chosen triplets form a 3DM. The exact proof of this statement 
together with the details of the construction can be found in [9] . 

Fig. 5. Encoding of zero and one. Panel (c): a layer, consisting of a one-tree and
its sibling.



An interesting question is, whether the transformation from 3DM to the 
one-step offline CA can be done in polynomial time. This depends on the input 
encoding of our problem. Two encodings seem natural: a zero-one vector that 
specifies for every node of the tree whether there is a code or not; or alternatively 
a sparse representation of the tree, consisting only of the positions of the assigned 
codes. Obviously, the transformation cannot be done in polynomial time for the
first input encoding, because the generated tree has exponentially many leaves
(each triplet tree alone has $3q$ layers of four levels). For the second
input encoding the transformation is polynomial, because the total number of
generated codes is polynomial in q, which is polynomial in the input size of 3DM. 
The discussion in this section leads to the following theorem. Its complete proof 
can be found in [9]. 

Theorem 2. The decision variant of the one-step offline CA is NP-complete for 
an input given by a list of positions of the assigned codes and the code insertion 
level. 



4 Exact Dynamic Programming Algorithm 

In this section we solve the one-step offline CA problem optimally using a dy- 
namic programming approach. The key idea of the algorithm is to store the right 
information in the nodes of the tree and to build it up in a bottom-up fashion. 

We define the signature of a subtree $T_v$ with root $v$ as an $(l(v) + 1)$-dimensional
vector $s^v = (s^v_0, \ldots, s^v_{l(v)})$, in which $s^v_i$ is the number of codes in $T_v$ on level $i$. A
signature $s$ is feasible if there exists a subtree $T_v$ with a code assignment that
has signature $s$. The information stored in every node $v$ of the tree consists of a
table, in which all possible feasible signatures of an arbitrary tree of height $l(v)$
are stored together with their cost for $T_v$. Here the cost of such a signature $s$
for $T_v$ (usually $s \neq s^v$) is defined as the minimum number of codes in $T_v$ that
have to move away from their old position in order to attain some tree $T'_v$ with
signature $s$. To attain $T'_v$ it might be necessary to move codes also into $T_v$ from
other subtrees but we do not count these movements for the cost of $s$ for $T_v$.

Given a code tree $T$ with all these tables computed, one can compute the
cost of any single code insertion from the table at the root node $r$: Let
$s^r = (s^r_0, \ldots, s^r_h)$ be the signature of the whole code tree before insertion, then the
cost of the insertion on level $l$ is the cost of the signature $(s^r_0, \ldots, s^r_l + 1, \ldots, s^r_h)$
in this table plus one. This follows because the minimum number of codes that
are moved away from their positions in $T$ is equal to the number of reassignments
minus one.

The computation of the tables starts at the leaf level, where the cost of the
one-dimensional signatures is trivially defined. At any node $v$ of level $l(v)$ the
cost $c(v, s)$ of signature $s$ for $T_v$ is computed from the cost incurred in the left
subtree $T_l$ of $v$ plus the cost incurred in the right subtree $T_r$ plus the cost at $v$.
The costs $c(l, s')$ and $c(r, s'')$ in the subtrees come from two feasible signatures
with the property $s = (s'_0 + s''_0, \ldots, s'_{l(v)-1} + s''_{l(v)-1}, s_{l(v)})$. Any pair $(s', s'')$ of
such signatures corresponds to a possible configuration after the code insertion.
The best pair for node $v$ gives $c(v, s)$. Let $s^v = (s^v_0, \ldots, s^v_{l(v)})$ be the signature
of $T_v$, then it holds that

\[
c(v, s) =
\begin{cases}
c(l, (0, \ldots, 0)) + c(r, (0, \ldots, 0)) & \text{for } s_{l(v)} = 1,\\
\min_{\{s', s'' \mid (s', 0) + (s'', 0) = s\}} \bigl(c(l, s') + c(r, s'')\bigr) & \text{for } s_{l(v)} = 0,\ s^v_{l(v)} = 0,\\
1 & \text{for } s_{l(v)} = 0,\ s^v_{l(v)} = 1.
\end{cases}
\]

The costs of all signatures $s$ for $v$ can be calculated simultaneously by combining
the two tables in the left and right children of $v$. Observe for the running time
that the number of relevant feasible signatures is bounded by $(n + 1)^h$ because
there cannot be more than $n$ codes on any level. The time to combine two tables
is $O(h \cdot n^{2h})$, thus the total running time is bounded by $O(2^h h \cdot n^{2h})$.

Theorem 3. The one-step offline CA can be solved optimally in time $O(2^h h \cdot n^{2h})$
and space $O(2^h \cdot n^h)$.
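
The recurrence of this section can be sketched in a few lines for small trees. The following is not the authors' implementation but a memoized top-down variant written under stated assumptions: codes are pairs (level, index) with index counted from the left on its level, feasibility of a signature is tested by the condition that the requested bandwidth fits into the subtree, and the function name `min_reassignments` is hypothetical. It enumerates all splits of a signature explicitly, so it is only meant for toy instances.

```python
from functools import lru_cache
from itertools import product


def min_reassignments(h, codes, level):
    """Minimum number of reassignments for one insertion on `level`, or None if it cannot be served."""
    codes = frozenset(codes)

    def count(lv, idx):
        # Number of assigned codes inside the subtree rooted at node (lv, idx).
        return sum(1 for l, i in codes if l <= lv and i >> (lv - l) == idx)

    def feasible(s):
        # Assumption: s is attainable in a tree of height len(s) - 1
        # exactly when its total bandwidth fits into that tree.
        return sum(c << i for i, c in enumerate(s)) <= 1 << (len(s) - 1)

    @lru_cache(maxsize=None)
    def cost(lv, idx, s):
        # Codes of the subtree at (lv, idx) that must leave their position
        # so that the subtree can attain the feasible signature s.
        if (lv, idx) in codes:
            return 0 if s[lv] == 1 else 1
        if s[lv] == 1:
            return count(lv, idx)
        if lv == 0 or count(lv, idx) == 0:
            return 0
        best = None
        for left in product(*(range(c + 1) for c in s[:lv])):
            right = tuple(c - a for c, a in zip(s, left))
            if feasible(left) and feasible(right):
                total = cost(lv - 1, 2 * idx, left) + cost(lv - 1, 2 * idx + 1, right)
                best = total if best is None else min(best, total)
        return best

    target = [sum(1 for l, _ in codes if l == i) for i in range(h + 1)]
    target[level] += 1
    target = tuple(target)
    if not feasible(target):
        return None
    # The inserted code itself counts as one reassignment.
    return cost(h, 0, target) + 1
```

For a tree of height 2 with leaf codes at indices 0 and 2, an insertion on level 1 needs two reassignments (one leaf code moves, plus the new code); with the leaf codes at indices 0 and 1 the right half is free and one reassignment suffices.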

5 An h-Approximation for One-Step Offline CA

We analyze a greedy algorithm $A_g$ for one-step offline CA based on the greedy
strategy from Sect. 2, where the cost function of the considered subtree is the
number of codes in it. The details of its analysis can be found in [9].

We are interested in the approximation ratio of $A_g$. Let opt denote the number
of reassigned codes of an optimal algorithm $A_{opt}$. For the upper bound we
compare $A_g$ to $A_{opt}$. Let us call the set of subtrees to the root of which $A_{opt}$
moves codes $T_{opt}$, and the arcs that show how $A_{opt}$ moves the codes the opt-arcs.
A sketch of the proof is as follows. First, we show that in every step $t$ $A_g$ has the
possibility to assign the codes $C_t$ that remain to be reassigned into $T_{opt}$. This
possibility can be expressed by a code mapping $\phi_t : C_t \to T_{opt}$. The key-property
is that in every step there is the theoretical choice to complete the current
assignment using the code mapping $\phi_t$ and the opt-arcs as follows: Use $\phi_t$ to assign
the codes in $C_t$ into positions in $T_{opt}$ and then use the opt-arcs to move codes
out of the subtrees of $T_{opt}$ to produce a feasible code assignment. This property
is enough to ensure that $A_g$ incurs a cost of no more than opt on each level.

For the lower bound there is an example, where the optimal assignment for
level $l$ chooses a subtree with 3 leaf codes, whereas $A_g$ always chooses a subtree
with 2 codes on level $l - 1$. Doing this recursively for every level, we get the lower
bound $\Omega(h)$ for the approximation ratio.

Theorem 4. The algorithm $A_g$ has approximation ratio $h$.
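
One possible reading of the greedy strategy analyzed here is sketched below. It is an illustration under assumptions, not the algorithm as specified in [18] or [9]: codes are pairs (level, index) with index counted from the left, candidate subtrees are those whose root carries no code and has no code above it, ties are broken in favour of the leftmost candidate, and the name `greedy_insert` is hypothetical.

```python
def greedy_insert(h, codes, level):
    """Top-down greedy insertion; returns (new assignment, number of reassignments)."""
    codes = set(codes)
    pending = [level]  # levels of codes that still need a position
    moved = 0

    def inside(root, code):
        # True if `code` lies in the subtree rooted at `root` (or is the root itself).
        (rl, ri), (cl, ci) = root, code
        return cl <= rl and ci >> (rl - cl) == ri

    while pending:
        pending.sort()
        lv = pending.pop()  # always place the maximum bandwidth code first
        candidates = [
            (lv, idx)
            for idx in range(2 ** (h - lv))
            if not any(inside(c, (lv, idx)) for c in codes)  # not blocked at or above
        ]
        if not candidates:
            raise ValueError("insertion exceeds the total bandwidth")
        # Cost function of A_g: the number of codes in the considered subtree.
        root = min(candidates, key=lambda r: sum(inside(r, c) for c in codes))
        displaced = [c for c in codes if inside(root, c)]
        codes.difference_update(displaced)
        pending.extend(cl for cl, _ in displaced)
        codes.add(root)
        moved += 1
    return codes, moved
```

On the tree of height 2 with leaf codes at indices 0 and 2, an insertion on level 1 takes the left half, displaces one leaf code to the last free leaf, and reports two reassignments.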

6 Online Code Assignment 

Here we study the online CA problem. In an online problem the input is received 
in an online manner and the output must be produced online [4]. In the case 
of the online CA problem the requests for code insertions and deletions must 
be handled one after another, i.e., the $i$th request must be served before the
$(i+1)$st request is known. An online algorithm ALG for the CA problem is
$c$-competitive if for all finite input sequences $I$, $ALG(I) \le c \cdot OPT(I)$. In this case
the competitive ratio of ALG is $c$.

We give a lower bound on the competitive ratio, analyze an $O(h)$-competitive
algorithm and present a resource augmented algorithm with constant
competitive ratio.

Theorem 5. No deterministic algorithm A for the online CA problem can be
better than 1.5-competitive.

Proof. Let A be any deterministic algorithm for the problem. Consider $N$ leaf
insertions. The adversary deletes every other code. Then a code insertion on
level $h - 1$ causes $N/4$ code reassignments. We proceed with the subtree of full
leaf codes recursively and repeat this process $\log_2 N - 1$ times. The optimal
algorithm $A_{opt}$ assigns the leaves in such a way that it does not need any extra
reassignment at all. Thus, $A_{opt}$ needs $N + \log_2 N - 1$ reassignments, whereas
algorithm A needs $3N/2 + \log_2 N - 2$ reassignments. The cost ratio tends to 1.5
as $N$ goes to infinity. □
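
The counting in this proof can be checked numerically. The sketch below only reproduces the accounting as read from the proof (N leaf insertions, then $\log_2 N - 1$ rounds that force $N/4, N/8, \ldots, 1$ reassignments and add one higher-level insertion each); the function name is hypothetical and $N$ is assumed to be a power of two with $N \ge 4$.

```python
def adversary_costs(N):
    """Return (cost of the online algorithm, cost of the optimum) for N a power of two, N >= 4."""
    rounds = N.bit_length() - 2  # log2(N) - 1 rounds
    forced = sum(N >> (i + 1) for i in range(1, rounds + 1))  # N/4 + N/8 + ... + 1
    online = N + forced + rounds  # leaf insertions + forced moves + higher-level insertions
    optimal = N + rounds          # the optimum pays only for the insertions
    return online, optimal
```

For N = 8 this gives 13 against 10, which agrees with $3N/2 + \log_2 N - 2$ and $N + \log_2 N - 1$; for growing N the quotient approaches 1.5 from below.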

One can show that all greedy algorithms for the online problem that minimize
the number of code reassignments for every insertion/deletion individually
are $\Omega(h)$-competitive [9]. In the following we show an algorithm that achieves
competitive ratio $O(h)$ and is easy to implement.

6.1 Compact Representation Algorithm 

The algorithm $A_{compact}$ keeps the codes in the tree $T$ ordered from left to right
by increasing level of the codes and compact (i.e., no code can be shifted to the
left without violating the order constraint).

In the following we show that $A_{compact}$ is $\Theta(h)$-competitive. To see this,
consider an arbitrary insertion on level $l$ (deletions are handled similarly). $A_{compact}$
inserts the code in the position on level $l$ that is the first after all assigned codes
on levels $0, \ldots, l$. This position can be blocked by at most one code from above.
$A_{compact}$ takes the code from this position away and assigns it recursively. At
most one code per level is reassigned (see Fig. 6).

For the lower bound we consider an example with 2 leaf codes and one code
on every higher level. Deleting the level $h - 1$ code and inserting a code on level 0,
the algorithm has to move every non leaf code to the right, i.e., it moves $h - 2$
codes. Deleting the 3rd leaf code and inserting the level $h - 1$ code, the algorithm
has to move all the codes back. Thus the following holds.

Fig. 6. Reassignment strategy of Algorithm $A_{compact}$

Theorem 6. Algorithm $A_{compact}$ is $\Theta(h)$-competitive.
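
Because the compact, level-sorted layout is determined by the number of codes per level, the cost of an insertion can be illustrated by comparing two layouts. This is a sketch under assumptions: a position is stored as (level, leftmost leaf below the code), `counts[i]` is the number of codes on level i, the tree is assumed to be large enough for the layout, and both function names are hypothetical.

```python
def compact_layout(counts):
    """Positions (level, leftmost leaf) of all codes, sorted by level and packed to the left."""
    positions = set()
    cursor = 0  # first leaf not yet covered by a code
    for level, r in enumerate(counts):
        if r == 0:
            continue
        width = 1 << level
        cursor = -(-cursor // width) * width  # round up to a multiple of width
        for _ in range(r):
            positions.add((level, cursor))
            cursor += width
    return positions


def insertion_cost(counts, level):
    """Number of positions of the new layout that were not used before the insertion."""
    new_counts = list(counts)
    new_counts[level] += 1
    return len(compact_layout(new_counts) - compact_layout(counts))
```

With two leaf codes and one code on each of the levels 1 and 2, inserting a third leaf code shifts both higher-level codes one position to the right, so `insertion_cost([2, 1, 1, 0, 0], 0)` is 3: the new code plus one reassignment on each of the two higher levels.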



6.2 Resource Augmented Online Algorithm 

Here we present the online-strategy 3-gap and study it by a resource-augmented 
competitive analysis. This type of analysis was introduced in 1995 by Kalyana- 
sundaram and Pruhs [15]. In a resource-augmented competitive analysis one 
compares the value of the solution found by the online algorithm when it is pro- 
vided with more resources to the value of the optimal offline adversary using the 
original resources. In the case of the OVSF online code assignment problem the 
resource is the total assignable bandwidth. The strategy 3-gap uses a tree T' of 
bandwidth 2b to accommodate codes whose total bandwidth is b. By the nature 
of the code assignment we cannot add a smaller amount of additional resource. 

A gap is a subtree with no code assigned in it and above it. Algorithm 3-gap
is similar to the compact representation algorithm of Sect. 6.1 (insisting on the
ordering of codes according to their level), only that it allows for up to 3 gaps at
each level $l$ (instead of only one for aligning), to the right of the assigned codes
on $l$. The algorithm for inserting a code at level $l$ is to place it at the leftmost
gap of $l$. If no such gap exists, we reassign the leftmost code of the next higher
level $l + 1$, creating 2 gaps (one of them is filled immediately by the new code)
at $l$. We repeat this procedure toward the root. We reject an insertion if the
nominal bandwidth $b$ is exceeded. For deleting a code $c$ on level $l$ we reassign
the rightmost code on level $l$ to $c$, keeping all codes at level $l$ left of the gaps
of $l$. If this results in 4 consecutive gaps, we reassign the rightmost code of $l + 1$,
in effect replacing two gaps of $l$ by one of $l + 1$. Again we proceed toward the
root.

More precisely, we keep for every level a range of codes (and gaps) that are
assigned to this level. In every range there are at most 3 gaps allowed. If we run
out of space or if there are too many gaps, we move the boundary between two
consecutive levels, affecting two places on the lower level and one on the upper
level. This notion of a range is in particular important for levels without codes.

The levels close to the root are handled differently, to avoid an excessive space
usage. The root-code of $T'$ has bandwidth $2b$, it is never used. The bandwidth $b$
code can only be used if no other code is used, there is no interaction with other
codes. The $b/2$ codes are kept compactly to the right. In general there is some
unused bandwidth between the $b/4$ and the $b/2$ codes, which is not considered a
gap. For all other levels ($\le b/8$ codes) we define a potential-function by counting
the number of levels without gaps, and the number of levels having 3 gaps, and
adding these numbers. With this potential function it is clear that it is sufficient
to charge two (re-)assignments to every insertion or deletion, one for placing the
code (filling the gap), and one for the potential function or for moving a
$b/4$-bandwidth code. The initial configuration is the empty tree, where the leaf-level
has two gaps, and all other levels have precisely one gap (only the close-to-root
levels are as described above).

It remains to show that our algorithm manages to host codes as long as
the total bandwidth used does not exceed $b$. To do this, we calculate the
bandwidth wasted by gaps, which is at most $3(\frac{b}{8} + \frac{b}{16} + \cdots) < 3b/4$. Hence the total
bandwidth used in $T'$ is at most $7b/4 < 2b$.
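
The bound on the wasted bandwidth can be spelled out as a geometric series. The display below is a sketch under the assumptions suggested by the description of the strategy: gaps are counted only on levels whose codes have bandwidth at most $b/8$, each such level holds at most 3 gaps, and leaf codes have bandwidth 1, so that the levels in question have bandwidth $b/2^j$ for $j = 3, \ldots, \log_2 b$.

```latex
\[
  \sum_{j=3}^{\log_2 b} 3 \cdot \frac{b}{2^{j}}
  \;=\; 3\left(\frac{b}{8} + \frac{b}{16} + \cdots + 1\right)
  \;=\; 3\left(\frac{b}{4} - 1\right)
  \;<\; \frac{3b}{4},
  \qquad\text{hence}\qquad
  b + \frac{3b}{4} \;=\; \frac{7b}{4} \;<\; 2b .
\]
```

The second part adds the bandwidth $b$ of the hosted codes to the wasted bandwidth, which is why a tree $T'$ of bandwidth $2b$ suffices.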

Theorem 7. Let $\sigma$ be a sequence of $m$ code insertions and deletions for a
code-tree of height $h$, such that at no time the bandwidth is exceeded. Then the above
online-strategy uses a code-tree of height $h + 1$ and performs at most $2m$ code
assignments and reassignments.

Corollary 1. The above strategy is 4-competitive for resource augmentation by
a factor of 2.

Proof. Any sequence of m operations contains at least m/2 insert operations. 
Hence the optimal offline solution needs at least m/2 assignments, and the above 
resource augmented online-algorithm uses at most 2m (re-)assignments, leading 
to a competitive factor of 4. □ 

This approach might prove to be useful in practice, particularly if the code 
insertions only use half the available bandwidth. 

7 Conclusions and Future Work 

In this paper we bring an algorithmically interesting problem from the mobile 
telecommunications field closer to the theoretical computer science community. 
We are the first to analyze the computational complexity of the OVSF code 
assignment problem. Future research on CA could concentrate on the following 
open problems. 

- Is there a constant approximation algorithm for the one-step offline CA?

- Can the gap between the lower bound of 1.5 and the upper bound of $O(h)$
  for the competitive ratio of the online CA be closed?

- Can an optimal algorithm for general offline CA be forced to reassign more
  than an amortized constant number of codes per insertion or deletion?

- What is the complexity of the general offline CA problem?

Abstract. We define a new invariant for the conjugacy of irreducible
sofic shifts. This invariant, that we call the syntactic graph of a sofic
shift, is the directed acyclic graph of characteristic groups of the non
null regular $\mathcal{D}$-classes of the syntactic semigroup of the shift.

Keywords: Automata and formal languages, symbolic dynamics. 



1 Introduction 

Sofic shifts [17] are sets of bi-infinite labels in a labeled graph. If the graph can be 
chosen strongly connected, the sofic shift is said to be irreducible. A particular 
subclass of sofic shifts is the class of shifts of finite type, defined by a finite set of 
forbidden blocks. Two sofic shifts X and Y are conjugate if there is a bijective 
block map from X onto Y. It is an open question to decide whether two sofic 
shifts are conjugate, even in the particular case of irreducible shifts of finite type. 

There are many invariants for conjugacy of subshifts, algebraic or combinato- 
rial, see [13, Chapter 7], [6], [12], [3]. For instance the entropy is a combinatorial 
invariant which gives the complexity of allowed blocks in a shift. The zeta func- 
tion is another invariant which counts the number of periodic orbits in a shift. 

In this paper, we define a new invariant for irreducible sofic shifts. This invari- 
ant is based on the structure of the syntactic semigroup of the language of finite 
blocks of the shift. Irreducible sofic shifts have a unique (up to isomorphisms of 
automata) minimal deterministic presentation, called the right Fischer cover of 
the shift. The syntactic semigroup S of an irreducible sofic shift is the transition 
semigroup of its right Fischer cover. 

In general, the structure of a finite semigroup is determined by the Green’s
relations (denoted $\mathcal{R}$, $\mathcal{L}$, $\mathcal{H}$, $\mathcal{D}$, $\mathcal{J}$) [16]. Our invariant is the acyclic directed graph
whose nodes are the characteristic groups of the non null regular $\mathcal{D}$-classes of $S$.
The edges correspond to the partial order $\le_{\mathcal{J}}$ between these $\mathcal{D}$-classes. We call
it the syntactic graph of the sofic shift. The result can be extended to the case
of reducible sofic shifts.

The proof of the invariant is based on Nasu’s Classification Theorem for
sofic shifts [15] that extends Williams’ one for shifts of finite type. This
theorem says that two irreducible sofic shifts $X$, $Y$ are conjugate if and
only if there is a sequence of transition matrices of right Fischer covers
$A = A_0, A_1, \ldots, A_{l-1}, A_l = B$, such that $A_{i-1}, A_i$ are elementary strong
shift equivalent for $1 \le i \le l$, where $A$ and $B$ are the transition matrices of the
right Fischer covers of $X$ and $Y$, respectively. This means that there are
transition matrices $U_i, V_i$ such that, after recoding the alphabets of $A_{i-1}$ and $A_i$, we
have $A_{i-1} = U_i V_i$ and $A_i = V_i U_i$. A bipartite shift is associated in a natural way
to a pair of elementary strong shift equivalent and irreducible sofic shifts [15].

The key point in our invariant is the fact that an elementary strong shift 
equivalence relation between transition matrices implies some conjugacy rela- 
tions between the idempotents in the syntactic semigroup of the bipartite shift. 

We show that particular classes of irreducible sofic shifts can be characterized 
with this syntactic invariant: the class of irreducible shifts of finite type and the 
class of irreducible aperiodic sofic shifts. 

Basic definitions related to symbolic dynamics are given in Section 2.1. We 
refer to [13] or [9] for more details. See also [10], [11], [4] about sofic shifts. 
Basic definitions and properties related to finite semigroups and their structure
are given in Section 2.2. We refer to [16, Chapter 3] for a more comprehensive
exposition. Nasu’s Classification Theorem is recalled in Section 2.3. We define
and prove our invariant in Section 3. A comparison of this syntactic invariant to 
some well known other ones is given in Section 4. Proofs of Propositions 1 and 
2 are omitted. The extension to the case of reducible sofic shifts is discussed at 
the end of Section 3. 

2 Definitions and Background 

2.1 Sofic Shifts and Their Presentations 

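Let $A$ be a finite alphabet, i.e. a finite set of symbols. The shift map
$\sigma : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$ is defined by $\sigma((a_i)_{i \in \mathbb{Z}}) = (a_{i+1})_{i \in \mathbb{Z}}$, for $(a_i)_{i \in \mathbb{Z}} \in A^{\mathbb{Z}}$. If $A^{\mathbb{Z}}$ is
endowed with the product topology of the discrete topology on $A$, a subshift is a
closed $\sigma$-invariant subset of $A^{\mathbb{Z}}$.

If $X$ is a subshift of $A^{\mathbb{Z}}$ and $n$ a positive integer, the $n$th higher power of $X$
is the subshift of $(A^n)^{\mathbb{Z}}$ defined by $X^n = \{(a_{in}, \ldots, a_{in+n-1})_{i \in \mathbb{Z}} \mid (a_i)_{i \in \mathbb{Z}} \in X\}$.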

A finite automaton is a finite multigraph labeled on $A$. It is denoted
$\mathcal{A} = (Q, E)$, where $Q$ is a finite set of states, and $E$ a finite set of edges labeled on
$A$. It is equivalent to a symbolic adjacency $(Q \times Q)$-matrix $A$, where $A_{pq}$ is the
finite formal sum of the labels of all the edges from $p$ to $q$. A sofic shift is the
set of the labels of all the bi-infinite paths on a finite automaton. If $\mathcal{A}$ is a finite
automaton, we denote by $X_{\mathcal{A}}$ the sofic shift defined by the automaton $\mathcal{A}$. Several
automata can define the same sofic shift. They are also called presentations or 
covers of the sofic shift. We will assume that all presentations are essential: all 
states have at least one outgoing edge and one incoming edge. An automaton 
is deterministic if for any given state and any given symbol, there is at most 
one outgoing edge labeled with this given symbol. A sofic shift is irreducible if it 
has a presentation with a strongly connected graph. Irreducible sofic shifts have 
a unique (up to isomorphisms of automata) minimal deterministic presentation 
called the right Fischer cover of the shift. 

Let $\mathcal{A} = (Q, E)$ be a finite deterministic (essential) automaton on the
alphabet $A$. Each finite word $w$ of $A^*$ defines a partial function from $Q$ to $Q$.
This function sends the state $p$ to the state $q$ if $w$ is the label of a path from
$p$ to $q$. The semigroup generated by all these functions is called the transition
semigroup of the automaton. When $X_{\mathcal{A}}$ is not the full shift, the semigroup has
a null element, denoted 0, which corresponds to words which are not factors of
any bi-infinite word of $X_{\mathcal{A}}$. The syntactic semigroup of an irreducible sofic shift
is defined as the transition semigroup of its right Fischer cover.

Example 1. The sofic shift presented by the automaton of Figure 1 is called the
even shift. Its syntactic semigroup is defined by the table in the right part of the
figure.

[Figure 1: automaton diagram not reproduced; table of the syntactic semigroup:]

          1    2
  a       1    -
  b       2    1
  ab      2    -
  ba      -    1
  bb      1    2
  bab     -    2
  aba     -    -

Fig. 1. The right Fischer cover of the even shift and its syntactic semigroup. Since $aa$
and $a$ define the same partial function from $Q$ to $Q$, we write $aa = a$ in the syntactic
semigroup. We also have $aba = 0$, or $ab^{2k+1}a = 0$ for any nonnegative integer $k$. The
word $bb$ is the identity in this semigroup.
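
The table of Example 1 can be recomputed by closing the partial functions of the letters under composition. The sketch below assumes the usual two-state presentation of the even shift that is consistent with the table (state 1 has a loop labeled a, and b exchanges the states 1 and 2); states are stored as the indices 0 and 1, `None` stands for "undefined", and the function name is hypothetical.

```python
def transition_semigroup(generators):
    """Map every partial function of the semigroup to a shortest word that defines it."""
    def compose(f, g):
        # Apply f first, then g: words are read from left to right.
        return tuple(None if q is None else g[q] for q in f)

    elements = {}
    queue = [(f, word) for word, f in generators.items()]
    while queue:
        f, word = queue.pop(0)  # breadth-first, so shorter words come first
        if f in elements:
            continue
        elements[f] = word
        for letter, g in generators.items():
            queue.append((compose(f, g), word + letter))
    return elements


# State 1 is index 0 and state 2 is index 1; None means that the word cannot be read.
EVEN_SHIFT = {"a": (0, None), "b": (1, 0)}
semigroup = transition_semigroup(EVEN_SHIFT)
```

The closure has seven elements, labeled by the words a, b, ab, ba, bb, aba and bab. The everywhere undefined function (the null element 0) is reached by aba, and bb is the identity on both states.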



2.2 Structure of Finite Semigroups 

We refer to [16] for more details about the notions defined in this section. 

Given a semigroup $S$, we denote by $S^1$ the following monoid: if $S$ is a monoid,
$S^1 = S$. If $S$ is not a monoid, $S^1 = S \cup \{1\}$ together with the law $*$ defined by
$x * y = xy$ if $x, y \in S$ and $1 * x = x * 1 = x$ for each $x \in S^1$.

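We recall Green's relations, which are fundamental equivalence relations
defined in a semigroup $S$. The four equivalence relations $\mathcal{R}$, $\mathcal{L}$,
$\mathcal{H}$, $\mathcal{J}$ are defined as follows. Let $x, y \in S$,

$$\begin{aligned}
x \mathcal{R} y &\Leftrightarrow xS^1 = yS^1, \\
x \mathcal{L} y &\Leftrightarrow S^1x = S^1y, \\
x \mathcal{J} y &\Leftrightarrow S^1xS^1 = S^1yS^1, \\
x \mathcal{H} y &\Leftrightarrow x \mathcal{R} y \text{ and } x \mathcal{L} y.
\end{aligned}$$

Another relation $\mathcal{D}$ is defined by:

$$x \mathcal{D} y \Leftrightarrow \exists z \in S,\ x \mathcal{R} z \text{ and } z \mathcal{L} y.$$

In a finite semigroup $\mathcal{J} = \mathcal{D}$. We recall the definition of the quasi-order
$\le_{\mathcal{J}}$:

$$x \le_{\mathcal{J}} y \Leftrightarrow S^1xS^1 \subseteq S^1yS^1.$$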

An $\mathcal{R}$-class is an equivalence class for the relation $\mathcal{R}$ (similar notations
hold for the other Green's relations). An idempotent is an element $e \in S$ such that
$ee = e$. A regular class is a class containing an idempotent. In a regular
$\mathcal{D}$-class, any $\mathcal{H}$-class containing an idempotent is a maximal subgroup of the
semigroup. Moreover, two regular $\mathcal{H}$-classes contained in the same $\mathcal{D}$-class are
isomorphic (as groups), see for instance [16, Proposition 1.8]. This group is called the
characteristic group of the regular $\mathcal{D}$-class. The quasi-order $\le_{\mathcal{J}}$ induces a
partial order between the $\mathcal{D}$-classes (still denoted $\le_{\mathcal{J}}$). The structure of
the transition semigroup $S$ is often described by the so-called "egg-box" pictures of
the $\mathcal{D}$-classes.

We say that two elements $x, y \in S$ are conjugate if there are elements $u, v \in S^1$
such that $x = uv$ and $y = vu$. Two idempotents belong to the same regular
$\mathcal{D}$-class if and only if they are conjugate, see for instance [16, Proposition 1.12].

Let $S$ be a transition semigroup of an automaton $\mathcal{A} = (Q, E)$ and $x \in S$. The
rank of $x$ is the cardinality of the image of $x$ as a partial function from $Q$ to $Q$.
The kernel of $x$ is the partition induced by the equivalence relation $\sim$ over the
domain of $x$ where $p \sim q$ if and only if $p$, $q$ have the same image by $x$. The kernel
of $x$ is thus a partition of the domain of $x$. We describe the egg-box pictures
with Example 1 continued in Figure 2.



[Figure 2: egg-box pictures of $D_1$, $D_2$, $D_3$. In the legible part of the figure, the
class $D_2$ has columns of image 1 and 2; its first row contains *a and ab, its second
row contains ba and *bab.]

Fig. 2. The syntactic semigroup of the even shift of Example 1 is composed of three
$\mathcal{D}$-classes $D_1$, $D_2$, $D_3$, of rank 2, 1 and 0, respectively, represented by the
above tables from left to right. Each square in a table represents an $\mathcal{H}$-class. Each
row represents an $\mathcal{R}$-class and each column an $\mathcal{L}$-class. The common kernel of the
elements in each row is written on the left of each row. The common image of the elements
in each column is written above each column. Idempotents are marked with the symbol *.
Each $\mathcal{D}$-class of this semigroup is regular. The characteristic groups of $D_1$, $D_2$,
$D_3$ are Z/2Z, the trivial group Z/Z and Z/Z, respectively.
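
The class structure stated in the caption of Figure 2 can be checked by brute force. The sketch below is an illustration only, under the same assumption on the even shift cover as in Example 1 (a loops on state 1, b exchanges states 1 and 2); it groups elements by their two-sided ideals, which gives the $\mathcal{D}$-classes because $\mathcal{J} = \mathcal{D}$ in a finite semigroup. All function names are hypothetical.

```python
GENS = {"a": (1, None), "b": (2, 1)}  # images of states 1 and 2; None = undefined


def mul(f, g):
    """Partial function of the word fg: apply f first, then g."""
    return tuple(None if q is None else g[q - 1] for q in f)


def closure(gens):
    """All elements of the generated semigroup, each named by a shortest word."""
    names = {f: w for w, f in gens.items()}
    queue = list(names)
    for f in queue:  # the list grows while we iterate (breadth-first search)
        for letter, g in gens.items():
            h = mul(f, g)
            if h not in names:
                names[h] = names[f] + letter
                queue.append(h)
    return names


NAMES = closure(GENS)
S = list(NAMES)


def right_ideal(x):  # x S^1
    return frozenset([x] + [mul(x, t) for t in S])


def left_ideal(x):  # S^1 x
    return frozenset([x] + [mul(s, x) for s in S])


def ideal(x):  # S^1 x S^1
    return frozenset(z for y in left_ideal(x) for z in right_ideal(y))


def rank(x):
    return len({q for q in x if q is not None})


def group_by(elements, key):
    groups = {}
    for x in elements:
        groups.setdefault(key(x), []).append(x)
    return list(groups.values())


summary = []
for d in group_by(S, ideal):  # J-classes, equal to D-classes here
    h_classes = group_by(d, lambda x: (right_ideal(x), left_ideal(x)))
    idempotents = sorted(NAMES[x] for x in d if mul(x, x) == x)
    summary.append((rank(d[0]), sorted(NAMES[x] for x in d), len(h_classes), idempotents))
summary.sort(reverse=True)
for row in summary:
    print(row)
```

Each printed tuple lists the rank of a class, its elements, its number of $\mathcal{H}$-classes and its idempotents; the null element appears under the name aba.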



Let $X$ be an irreducible sofic shift and $S$ its syntactic semigroup. It is known
that $S$ has a unique $\mathcal{D}$-class of rank 1 which is regular (see [4] or [5], see also
[8]).

We define a finite directed acyclic graph (DAG) associated with $X$ as follows.
The set of vertices of the DAG is the set of regular $\mathcal{D}$-classes of $S$, excluding
the regular $\mathcal{D}$-class of null rank, if there is one. Each vertex is labeled with the
rank of the $\mathcal{D}$-class and its characteristic group. There is an edge from the vertex
associated with a $\mathcal{D}$-class $D$ to the vertex associated with a $\mathcal{D}$-class $D'$ if and
only if $D' \le_{\mathcal{J}} D$. We call this acyclic graph the syntactic graph of $X$ (see
Figure 3 for an example). Note that the regular $\mathcal{D}$-class of null rank, if there is one,
is not taken into account in a syntactic graph. This is linked to the fact that a full
shift (i.e. the set of all bi-infinite words on a finite alphabet) can be conjugate
to a non full shift.



(rank 2, Z/2Z) $\to$ (rank 1, Z/Z)

Fig. 3. The syntactic graph of the even shift of Example 1. We have $D_2 \le_{\mathcal{J}} D_1$
since, for instance, $S^1 ab S^1 \subseteq S^1 b S^1$.



2.3 Nasu’s Classification Theorem for Sofic Shifts 

In this section, we recall Nasu's Classification Theorem for sofic shifts [15] (see
also [13, p. 232]), which extends Williams' Classification Theorem for shifts of
finite type (see [13, p. 229]).

Let $X \subseteq A^{\mathbb{Z}}$, $Y \subseteq B^{\mathbb{Z}}$ be two subshifts and $m$, $a$ be nonnegative
integers. A map $\phi : X \to Y$ is an $(m,a)$-block map (or $(m,a)$-factor map) if
there is a map $\delta : A^{m+a+1} \to B$ such that $\phi((a_i)_{i \in \mathbb{Z}}) = (b_i)_{i \in \mathbb{Z}}$ where
$\delta(a_{i-m} \cdots a_{i-1} a_i a_{i+1} \cdots a_{i+a}) = b_i$. A block map is an $(m,a)$-block map
for some nonnegative integers $m$, $a$. The well-known theorem of Curtis, Hedlund and
Lyndon [7] asserts that continuous and shift-commuting maps are exactly block
maps. A conjugacy is a one-to-one and onto block map (then, a shift being
compact, its inverse is also a block map).
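
To make the definition of an $(m,a)$-block map concrete, here is a small sketch on finite words. It is only an illustration: the local rule `xor_rule` is a hypothetical example that does not come from the paper, and on a finite word only the positions whose whole window is available produce an output symbol.

```python
def apply_block_map(delta, m, a, word):
    """Apply an (m, a)-block map to a finite word.

    delta maps a window of length m + a + 1 to one output symbol. Position i of
    the output is computed from the window starting at position i of `word`,
    so only len(word) - (m + a) output symbols are produced.
    """
    width = m + a + 1
    return [delta(tuple(word[i:i + width])) for i in range(len(word) - width + 1)]


def xor_rule(window):
    """Hypothetical local rule with m = 0 and a = 1: b_i = a_i + a_{i+1} (mod 2)."""
    return (window[0] + window[1]) % 2


word = [0, 1, 1, 0, 1]
image = apply_block_map(xor_rule, 0, 1, word)
shifted_image = apply_block_map(xor_rule, 0, 1, word[1:])
print(image, shifted_image)
```

Applying the rule to the shifted word gives the shifted image, which is the finite-word counterpart of the shift-commuting property.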

Let A be a symbolic adjacency $(Q \times Q)$-matrix of an automaton $\mathcal{A}$ with
entries in a finite alphabet A. Let B be a finite alphabet and $f$ a one-to-one map
from A to B. The map $f$ is extended to a morphism from finite formal sums of
elements of A to finite formal sums of elements of B. We say that $f$ transforms
A into an adjacency $(Q \times Q)$-matrix B if $B_{pq} = f(A_{pq})$.

We now define the notion of strong shift equivalence between two symbolic
adjacency matrices.

Let A and B be two finite alphabets. We denote by AB the set of words $ab$
with $a \in A$ and $b \in B$.

Two symbolic adjacency matrices A, with entries in A, and B, with entries in
B, are elementary strong shift equivalent if there is a pair of symbolic adjacency
matrices $(U, V)$ with entries in disjoint alphabets $\mathcal{U}$ and $\mathcal{V}$ respectively, such
that there is a one-to-one map from A to $\mathcal{U}\mathcal{V}$ which transforms A into $UV$, and
there is a one-to-one map from B to $\mathcal{V}\mathcal{U}$ which transforms B into $VU$.

Two symbolic adjacency matrices A and B are strong shift equivalent within
right Fischer covers if there is a sequence of symbolic adjacency matrices of right
Fischer covers

$$A = A_0, A_1, \ldots, A_{l-1}, A_l = B$$

such that for $1 \le i \le l$ the matrices $A_{i-1}$ and $A_i$ are elementary strong shift
equivalent.

Theorem 1 (Nasu). Let X and Y be irreducible sofic shifts and let A and 
B be the symbolic adjacency matrices of the right Fischer covers of X and Y, 
respectively. Then X and Y are conjugate if and only if A and B are strong shift 
equivalent within right Fischer covers. 



Example 2. Let us consider the two (conjugate) irreducible sofic shifts $X$ and $Y$
defined by the right Fischer covers $\mathcal{A} = (Q, E)$ and $\mathcal{B} = (Q', E')$ in Figure 4.





Fig. 4. Two conjugate shifts X and Y . 



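The symbolic adjacency matrices of these automata are respectively

$$A = \begin{pmatrix} a & b \\ b & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} a' & 0 & d' \\ c' & 0 & b' \\ 0 & b' & 0 \end{pmatrix}.$$

Then A and B are elementary strong shift equivalent with

$$U = \begin{pmatrix} u_1 & 0 & u_2 \\ 0 & u_2 & 0 \end{pmatrix}, \qquad
V = \begin{pmatrix} v_1 & 0 \\ v_2 & 0 \\ 0 & v_2 \end{pmatrix}.$$

Indeed,

$$UV = \begin{pmatrix} u_1v_1 & u_2v_2 \\ u_2v_2 & 0 \end{pmatrix}, \qquad
VU = \begin{pmatrix} v_1u_1 & 0 & v_1u_2 \\ v_2u_1 & 0 & v_2u_2 \\ 0 & v_2u_2 & 0 \end{pmatrix}.$$

The one-to-one maps from A = {a, b} to $\mathcal{U}\mathcal{V}$ and from B = {a', b', c', d'} to
$\mathcal{V}\mathcal{U}$ are described in the tables below.

   a    u_1 v_1            a'   v_1 u_1
   b    u_2 v_2            b'   v_2 u_2
                           c'   v_2 u_1
                           d'   v_1 u_2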
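
The two matrix products of Example 2 can be verified with a few lines of code. This is a sketch under stated assumptions: matrix entries are modeled as lists of words (formal sums with coefficients 0 or 1), the letters are written as short strings such as "u1", and the matrices A and B are entered as the preimages of UV and VU under the two one-to-one maps of the example. The helper names are hypothetical.

```python
def sym_product(M, N):
    """Product of symbolic matrices whose entries are lists of words (formal sums)."""
    inner, cols = len(N), len(N[0])
    return [[sorted(x + y for r in range(inner) for x in row[r] for y in N[r][q])
             for q in range(cols)] for row in M]


def relabel(M, mapping):
    """Apply a letter-to-word map to every entry of a symbolic matrix."""
    return [[sorted(mapping[x] for x in entry) for entry in row] for row in M]


# [] stands for the zero entry.
U = [[["u1"], [], ["u2"]],
     [[], ["u2"], []]]
V = [[["v1"], []],
     [["v2"], []],
     [[], ["v2"]]]
A = [[["a"], ["b"]],
     [["b"], []]]
B = [[["a'"], [], ["d'"]],
     [["c'"], [], ["b'"]],
     [[], ["b'"], []]]
f = {"a": "u1v1", "b": "u2v2"}
g = {"a'": "v1u1", "b'": "v2u2", "c'": "v2u1", "d'": "v1u2"}

print(sym_product(U, V))
print(sym_product(V, U))
```

Comparing `relabel(A, f)` with `sym_product(U, V)`, and `relabel(B, g)` with `sym_product(V, U)`, checks that f transforms A into UV and g transforms B into VU.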



An elementary strong shift equivalence enables the construction of an irreducible
sofic shift $Z$ on the alphabet $\mathcal{U} \cup \mathcal{V}$ as follows. The sofic shift $Z$ is defined by
the automaton $\mathcal{C} = (Q \cup Q', F)$, where the symbolic adjacency matrix $C$ of $\mathcal{C}$ is

$$C = \begin{pmatrix} 0 & U \\ V & 0 \end{pmatrix},$$

with rows and columns indexed by $Q$ and $Q'$, in this order.

The shift $Z$ is called the bipartite shift defined by $U$, $V$ (see Figure 5). An edge
of $\mathcal{C}$ labeled on $\mathcal{U}$ goes from a state in $Q$ to a state in $Q'$. An edge of $\mathcal{C}$ labeled
on $\mathcal{V}$ goes from a state in $Q'$ to a state in $Q$. Note that the second higher
power of $Z$ is the disjoint union of $X$ and $Y$. Note also that $\mathcal{C}$ is a right Fischer
cover (i.e. is minimal).




Fig. 5. The bipartite shift $Z$.

3 A Syntactic Invariant

In this section, we define a syntactic invariant for the conjugacy of irreducible
sofic shifts.

Theorem 2. Let $X$ and $Y$ be two irreducible sofic shifts. If $X$ and $Y$ are
conjugate, then they have the same syntactic graph.

We give a few lemmas before proving Theorem 2.

Let $X$ (respectively $Y$) be an irreducible sofic shift whose right Fischer cover has a
symbolic adjacency $(Q \times Q)$-matrix (respectively $(Q' \times Q')$-matrix) denoted by A
(respectively by B). We assume that A and B are elementary strong shift equivalent
through a pair of matrices $(U, V)$. The corresponding alphabets are denoted A, B,
$\mathcal{U}$, and $\mathcal{V}$ as before. We denote by $f$ a one-to-one map from A to $\mathcal{U}\mathcal{V}$ which
transforms A into $UV$ and by $g$ a one-to-one map from B to $\mathcal{V}\mathcal{U}$ which transforms
B into $VU$. Let $Z$ be the bipartite irreducible sofic shift associated to $U$, $V$. We
denote by $S$ (respectively $T$, $R$) the syntactic semigroup of $X$ (respectively $Y$, $Z$).

Let $w \in R$. If $w$ is non null, the bipartite nature of $Z$ implies that $w$ is a
function from $Q \cup Q'$ to $Q \cup Q'$ whose domain is included either in $Q$ or in $Q'$, and
whose image is included either in $Q$ or in $Q'$. If $w \neq 0$ with a domain included
in $P$ and an image included in $P'$, we say that $w$ has the type $(P, P')$. Note
that $w$ has type $(Q, Q)$ if and only if $w \neq 0$ and $w \in (f(A))^*$, and $w$ has type
$(Q', Q')$ if and only if $w \neq 0$ and $w \in (g(B))^*$.

Lemma 1. Elements of $R$ in the same non null $\mathcal{H}$-class have the same type.

Proof. We show the property for the $(Q, Q)$-type. Let $w \in R$ be of type
$(Q, Q)$. If $w = w'v$ with $w', v \in R$, then $w'$ has type $(Q, *)$. If $w = zw'$ with
$z, w' \in R$, then $w'$ has type $(*, Q)$. Thus, $w \mathcal{H} w'$ implies that $w'$ has type $(Q, Q)$.
□

The $\mathcal{H}$-classes of $R$ containing elements of type $(Q, Q)$ (respectively $(Q', Q')$)
are called $(Q, Q)$-$\mathcal{H}$-classes (respectively $(Q', Q')$-$\mathcal{H}$-classes).

Let $w = a_1 \ldots a_n$ be an element of $S$; we define the element $f(w)$ as
$f(a_1) \cdots f(a_n)$. Note that this definition is consistent since if
$a_1 \ldots a_n = a'_1 \ldots a'_m$ in $S$, then $f(a_1) \ldots f(a_n) = f(a'_1) \ldots f(a'_m)$ in $R$.
Similarly we define an element $g(w)$ for any element $w$ of $T$.

Conversely, let $w$ be an element of $R$ belonging to $f(A)^*$ ($\subseteq (\mathcal{U}\mathcal{V})^*$). Then
$w = f(a_1) \ldots f(a_n)$, with $a_i \in A$. We define $f^{-1}(w)$ as $a_1 \ldots a_n$. Similarly we
define $g^{-1}(w)$. Again these definitions and notations are consistent. Thus $f$ is a
semigroup isomorphism from $S$ to the subsemigroup of $R$ of transition functions
defined by the words in $(f(A))^*$. Notice that $f(0) = 0$ if $0 \in S$. Analogously,
$g$ is a semigroup isomorphism from $T$ to the subsemigroup of $R$ of transition
functions defined by the words in $(g(B))^*$.

Lemma 2. Let $w, w' \in R$ of type $(Q, Q)$. Then $w \mathcal{H} w'$ in $R$ if and only if
$f^{-1}(w) \mathcal{H} f^{-1}(w')$ in $S$.

Proof. Let $w = f(a_1) \ldots f(a_n)$ and $w' = f(a'_1) \ldots f(a'_m)$ with $a_i, a'_j \in A$. We
have $w = w'v$ with $v \in R$ if and only if $v = f(\bar{a}_1) \ldots f(\bar{a}_r)$ with $\bar{a}_i \in A$ and
$f(a_1) \ldots f(a_n) = f(a'_1) \ldots f(a'_m) f(\bar{a}_1) \ldots f(\bar{a}_r)$. This is equivalent to
$a_1 \ldots a_n = a'_1 \ldots a'_m \bar{a}_1 \ldots \bar{a}_r$, that is $f^{-1}(w) S^1 \subseteq f^{-1}(w') S^1$. Analogously,
we have $w' = wv'$ with $v' \in R$ if and only if $f^{-1}(w') S^1 \subseteq f^{-1}(w) S^1$. This proves
that $w \mathcal{R} w'$ in $R$ if and only if $f^{-1}(w) \mathcal{R} f^{-1}(w')$ in $S$. In the same way, one can
prove the same statement for the relation $\mathcal{L}$ and hence for the relation $\mathcal{H}$. □

A similar statement holds for $(Q', Q')$-$\mathcal{H}$-classes.

Lemma 3. Let $w, w' \in R$ of type $(Q, Q)$. Then $w \le_{\mathcal{J}} w'$ in $R$ if and only if
$f^{-1}(w) \le_{\mathcal{J}} f^{-1}(w')$ in $S$. This implies that $w \mathcal{J} w'$ in $R$ if and only if
$f^{-1}(w) \mathcal{J} f^{-1}(w')$ in $S$.

Proof. The first statement can be proved as in the previous lemma. □

Similar results hold between $T$ and $R$. As a consequence we get the following
lemma.

Lemma 4. The bijection $f$ between $S$ and the elements of $R$ belonging to
$(f(A))^*$ induces a bijection between the non null $\mathcal{H}$-classes of $S$ and the
$(Q, Q)$-$\mathcal{H}$-classes of $R$. Moreover this bijection keeps the relations $\mathcal{J}$,
$\le_{\mathcal{J}}$ and the rank of the $\mathcal{H}$-classes.

A similar statement holds for the bijection $g$.

We now come to the main lemma, which shows the link between the ele- 
mentary strong shift equivalence of the symbolic adjacency matrices and the 
conjugacy of some idempotents in the semigroup. This link is the key point of 
the invariant. 

Lemma 5. Let $H$ be a regular $(Q, Q)$-$\mathcal{H}$-class of $R$. Then there is a regular
$(Q', Q')$-$\mathcal{H}$-class in the same $\mathcal{D}$-class as $H$.

Proof. Let $e \in R$ be an idempotent element of type $(Q, Q)$. Let $u_1v_1 \ldots u_nv_n$ in
$(\mathcal{U}\mathcal{V})^*$ be such that $e = u_1v_1 \ldots u_nv_n$. We define $\bar{e} = v_1 \ldots u_nv_nu_1$. Thus
$eu_1 = u_1\bar{e}$ in $R$. Note that $\bar{e}$ depends on the choice of the word $u_1v_1 \ldots u_nv_n$
representing $e$ in $R$.

If $w$ denotes $v_1 \ldots u_nv_n$ and $v$ denotes $u_1$, we have $e = vw$ and $\bar{e} = wv$. It
follows that $e$ and $\bar{e}$ are conjugate, thus $e^2 = e$ and $\bar{e}^2$ are conjugate. Moreover

$$\bar{e}^3 = wvwvwv = weev = wev = wvwv = \bar{e}^2.$$

Thus $\bar{e}^2$ is an idempotent conjugate to the idempotent $e$. As a consequence $e$
and $\bar{e}^2$ belong to the same $\mathcal{D}$-class of $R$ (see Section 2), and $\bar{e}^2 \neq 0$. The result
follows since $\bar{e}^2$ is of type $(Q', Q')$. □

Note that the number of regular $(Q, Q)$-$\mathcal{H}$-classes and the number of regular
$(Q', Q')$-$\mathcal{H}$-classes in the same $\mathcal{D}$-class of $R$ may be different in general.

We now prove Theorem 2.

Proof [of Theorem 2]. By Nasu's Theorem [15] we can assume, without loss of
generality, that the symbolic adjacency matrices of the right Fischer covers of
$X$ and $Y$ are elementary strong shift equivalent. We define the bipartite shift $Z$
as above. We denote by $S$, $T$ and $R$ the syntactic semigroups of $X$, $Y$ and $Z$
respectively.

Let $D$ be a non null regular $\mathcal{D}$-class of $S$. Let $H$ be a regular $\mathcal{H}$-class of
$S$ contained in $D$. Let $H'' = f(H)$. By Lemma 4, the groups $H$ and $H''$ are
isomorphic. Let $D''$ be the $\mathcal{D}$-class of $R$ containing $H''$. By Lemma 5, there is at
least one regular $(Q', Q')$-$\mathcal{H}$-class $K''$ in $D''$, which is isomorphic to $H''$. Let
$H' = g^{-1}(K'')$ and let $D'$ be the $\mathcal{D}$-class of $T$ containing $H'$. By Lemma 4, the
groups $H'$ and $K''$ are isomorphic. Hence the groups $H$ and $H'$ are isomorphic.

By Lemmas 4 and 5, the above construction of $D'$ from $H$ is a
bijective function $\varphi$ from the non null regular $\mathcal{D}$-classes of $S$ onto the non null
regular $\mathcal{D}$-classes of $T$. Moreover the characteristic group of $D$ is isomorphic to
the characteristic group of $\varphi(D)$ and, by Lemma 4, the rank of $D$ is equal to the
rank of $\varphi(D)$.

We now consider two non null regular $\mathcal{D}$-classes $D_1$ and $D_2$ of $S$. By Lemma 4
and Lemma 5, $D_1 \le_{\mathcal{J}} D_2$ if and only if $\varphi(D_1) \le_{\mathcal{J}} \varphi(D_2)$. It follows that the
syntactic graphs of $S$ and $T$ are isomorphic through the bijection $\varphi$. □

Nasu’s Classification Theorem holds for reducible sofic shifts by the use of 
right Krieger covers instead of right Fischer covers [15]. This enables the ex- 
tension of our result to the case of reducible sofic shifts. This extension is not 
described in this short version of the paper. 

4 How Dynamic Is This Invariant? 

We briefly compare the syntactic conjugacy invariant with other classical con- 
jugacy invariants. We refer to [13] for the definitions and properties of these 
classical invariants. 

First, one can remark that the syntactic invariant does not capture all the
dynamics. Two sofic shifts can have the same syntactic graph and a different
entropy, see the example given in Figure 6.

The comparison with the zeta function is more interesting. Recall that the
zeta function of a shift $X$ is $\zeta_X(z) = \exp \sum_{n \ge 1} p_n \frac{z^n}{n}$, where $p_n$ is the
number of bi-infinite words $x \in X$ such that $\sigma^n(x) = x$. We give in Figure 7 an example
of two irreducible sofic shifts which have the same zeta function and different
syntactic graphs.

Irreducible shifts of finite type can be characterized with this syntactic
invariant. Other equivalent characterizations of finite type shifts can be found in
[14] and in [8].

Proposition 1. An irreducible sofic shift is of finite type if and only if its
syntactic graph is reduced to one node of rank 1 representing the trivial group.

Fig. 6. The two above sofic shifts $X$, $Y$ have the same syntactic graph and a different
entropy. Indeed, we have $b = c$ in the syntactic semigroup of $Y$. Hence the shifts $X$
and $Y$ have the same syntactic semigroup.



Another interesting class of irreducible sofic shifts can be characterized with 
the syntactic invariant. It is the class of aperiodic sofic shifts [1]. 

Let $x \in X$; we denote by period($x$) the least positive integer $n$ such that
$\sigma^n(x) = x$ if such an integer exists. It is equal to $\infty$ otherwise.

Let $X$, $Y$ be two subshifts and let $\phi : X \to Y$ be a block map. The map $\phi$ is
said to be aperiodic if period($x$) = period($\phi(x)$) for any $x \in X$. Roughly speaking,
such a factor map $\phi$ does not make periods decrease.

A sofic shift $X$ is aperiodic if it is the image of a shift of finite type by an
aperiodic block map. A characterization of irreducible aperiodic sofic shifts is
the following.

Proposition 2. An irreducible sofic shift is aperiodic if and only if its syntactic 
graph contains only trivial groups. 

Schützenberger's characterization of aperiodic languages (see for instance [16,
Theorem 2.1]) asserts that the set of blocks of an aperiodic sofic shift is a regular
star-free language.




Fig. 7. Two sofic shifts $X$, $Y$ which have the same zeta function (see for
instance [13, Theorem 6.4.8], or [2] for the computation of the zeta function of a
sofic shift), and different syntactic invariants. Indeed the syntactic graph of $X$ is
(rank 2, Z/2Z) $\to$ (rank 1, Z/Z) while the syntactic graph of $Y$ has only one node
(rank 1, Z/Z). Thus they are not conjugate. Notice that $Y$ is a shift of finite type.

Abstract. The relationship between the length of a word and the maximum
length of its unbordered factors is investigated in this paper.

A word is bordered, if it has a proper prefix that is also a suffix of that
word. Consider a finite word $w$ of length $n$. Let $\mu(w)$ denote the maximum
length of its unbordered factors, and let $d(w)$ denote the period of $w$.
Clearly, $\mu(w) \le d(w)$.

We establish that $\mu(w) = d(w)$, if $w$ has an unbordered prefix of length
$\mu(w)$ and $n \ge 2\mu(w) - 1$. This bound is tight and solves a 21 year old
conjecture by Duval. It follows from this result that, in general, $n \ge 3\mu(w)$
implies $\mu(w) = d(w)$ which gives an improved bound for the question
asked by Ehrenfeucht and Silberger in 1979.



1 Introduction 

Periodicity and borderedness are two properties of words which are investigated
in this paper. These concepts are foundational and play a role (explicitly or
implicitly) in many areas of computer science. Classical examples are pattern
matching algorithms [15,3,7], data compression [19,6], and codes [2]; periodicity and
borderedness of words (sequences) are also important concepts in computational
biology, e.g., sequence assembly [17] or superstrings [4], and in serial data
communications systems [5], among other areas. It is well known that these two word
properties do not exist independently from each other. However, it is somewhat
surprising that no clear relation has been established so far, despite the fact that
this basic question has been around for more than 20 years.

Let us consider a finite word (a sequence of letters) $w$. We denote the length
of $w$ by $|w|$ and call a subsequence of consecutive letters of a word a factor. The
period of $w$, denoted by $d(w)$, is the smallest positive integer $p$ such that the
$i$-th letter equals the $(i+p)$-th letter for all $1 \le i \le |w| - p$. Let $\mu(w)$ denote the
length of the longest unbordered factor of $w$. A word is bordered, if it has a proper
prefix that is also a suffix, where we call a prefix proper, if it is neither empty
nor contains the entire word. For the investigation of the relationship between
$|w|$ and the maximality of $\mu(w)$, that is, $\mu(w) = d(w)$, we consider the special
case where the longest unbordered prefix of a word is of the maximum length,
that is, no unbordered factor is longer than that prefix. Let $w$ be an unbordered
word. Then a word $wu$ is a Duval extension (of $w$), if every unbordered factor
of $wu$ has at most length $|w|$, that is, $\mu(wu) = |w|$. We call $wu$ a trivial Duval
extension, if $d(wu) = |w|$. For example, let $w = abaabb$ and $u = aaba$. Then
$wu = abaabbaaba$ is a nontrivial Duval extension of $w$ since (i) $w$ is unbordered,
(ii) all factors of $wu$ longer than $w$ are bordered, that is, $|w| = \mu(wu) = 6$, and
(iii) the period of $wu$ is 7, and hence, $d(wu) > |w|$. Note, that this example
satisfies $|u| = |w| - 2$.
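
The three properties claimed for this example can be checked by exhaustive search. The functions below are a brute-force sketch written for this illustration (their names are not taken from the paper) and are only meant for short words.

```python
def is_bordered(word):
    """True if some nonempty proper prefix of `word` is also a suffix."""
    return any(word[:k] == word[-k:] for k in range(1, len(word)))


def period(word):
    """Smallest p >= 1 such that word[i] == word[i + p] wherever both exist."""
    n = len(word)
    return next(p for p in range(1, n + 1)
                if all(word[i] == word[i + p] for i in range(n - p)))


def mu(word):
    """Maximum length of an unbordered factor of `word`."""
    n = len(word)
    return max(j - i for i in range(n) for j in range(i + 1, n + 1)
               if not is_bordered(word[i:j]))


w, u = "abaabb", "aaba"
wu = w + u
print(is_bordered(w), mu(wu), period(wu))
```

Here `mu(wu) == len(w)` expresses that wu is a Duval extension of w, and `period(wu) > len(w)` that it is nontrivial.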

In 1979 Ehrenfeucht and Silberger initiated a line of research [10,1,8] exploring
the relationship between the length of a word $w$ and $\mu(w)$. In 1982 these
efforts culminated in Duval's result: If $|w| \ge 4\mu(w) - 6$ then $d(w) = \mu(w)$.
However, it was conjectured in [1] that $|w| \ge 3\mu(w)$ implies $d(w) = \mu(w)$ which
follows if Duval's conjecture [8] holds true.

Conjecture 1. If $wu$ is a nontrivial Duval extension of $w$, then $|u| < |w|$.

After that, no progress was recorded, to the best of our knowledge, for 20 years.
However, the topic remained popular, see for example Chapter 8 in [16]; for
recent results see [18] and [9]. Recently, a first improvement of Duval's result was
introduced in [11] where it was shown that a Duval extension of $w$ longer than
$5\mu(w)/2 - 1$ is trivial. However, the main result of our contribution here is a final
characterization of this border by proving an improved version of Conjecture 1.

Theorem 2. If $wu$ is a nontrivial Duval extension of $w$, then $|u| < |w| - 1$.

The example mentioned above shows that this bound on the length of a nontrivial
Duval extension is tight. Theorem 2 implies the truth of Duval's conjecture, as
well as the following corollary (for any word $w$).

Corollary 3. If $|w| \ge 3\mu(w)$, then $d(w) = \mu(w)$.

This corollary confirms the conjecture by Assous and Pouzet in [1] about a
question asked by Ehrenfeucht and Silberger in [10].

Our main result, Theorem 2, is presented in Section 4, which uses the
notations introduced in Section 2 and preliminary results from Section 3. We conclude
with Section 5.



2 Notations 

In this section we introduce the notations of this paper. We refer to [16] for more 
basic and general definitions. 

We consider a finite alphabet $A$ of letters. Let $A^*$ denote the set of all finite
words over $A$ including the empty word, denoted by $\varepsilon$. Let $w = w_{(1)}w_{(2)} \cdots w_{(n)}$
where $w_{(i)}$ is a letter, for every $1 \le i \le n$. We denote the length $n$ of $w$ by $|w|$.
An integer $1 \le p \le n$ is a period of $w$, if $w_{(i)} = w_{(i+p)}$ for all $1 \le i \le n - p$.
The smallest period of $w$ is called the minimum period (or simply, the period)
of $w$, denoted by $d(w)$. A nonempty word $u$ is called a border of a word $w$, if
$w = uv = v'u$ for some suitable words $v$ and $v'$. We call $w$ bordered, if it has
a border that is shorter than $w$, otherwise $w$ is called unbordered. Note, that
every bordered word $w$ has a minimum border $u$ such that $w = uvu$, where $u$ is
unbordered. Let $\mu(w)$ denote the maximum length of unbordered factors of $w$.
Suppose $w = uv$, then $u$ is called a prefix of $w$, denoted by $u \le w$, and $v$ is called
a suffix of $w$, denoted by $v \preccurlyeq w$. Let $u, v \neq \varepsilon$. Then we say that $u$ overlaps $v$
from the left or from the right, if there is a word $w$ such that $|w| < |u| + |v|$, and
$u \le w$ and $v \preccurlyeq w$, or $v \le w$ and $u \preccurlyeq w$, respectively. We say that $u$ overlaps
(intersects) with $v$, if either $v$ is a factor of $u$ or $u$ is a factor of $v$ or $u$ overlaps
$v$ from the left or right.

Let us consider the following examples. Let $A = \{a, b\}$ and $u, v, w \in A^*$ such
that $u = abaa$ and $v = baaba$ and $w = abaaba$. Then $|w| = 6$, and 3, 5, and 6
are periods of $w$, and $d(w) = 3$. We have that $a$ is the shortest border of $u$ and
$w$, whereas $ba$ is the shortest border of $v$. We have $\mu(w) = 3$. We also have that
$u$ and $v$ overlap since $u \le w$ and $v \preccurlyeq w$ and $|w| < |u| + |v|$.
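
These small examples are easy to confirm by direct computation. The following sketch uses naive definitions that mirror the text; the function names are illustrative choices, not notation from the paper.

```python
def periods(word):
    """All periods p (1 <= p <= len(word)) of `word`, in increasing order."""
    n = len(word)
    return [p for p in range(1, n + 1)
            if all(word[i] == word[i + p] for i in range(n - p))]


def borders(word):
    """All borders of `word` that are shorter than the word, shortest first."""
    return [word[:k] for k in range(1, len(word)) if word[:k] == word[-k:]]


def mu(word):
    """Maximum length of an unbordered factor of `word`."""
    n = len(word)
    return max(j - i for i in range(n) for j in range(i + 1, n + 1)
               if not borders(word[i:j]))


def overlaps_from_left(u, v, w):
    """True if w witnesses that u overlaps v from the left."""
    return w.startswith(u) and w.endswith(v) and len(w) < len(u) + len(v)


u, v, w = "abaa", "baaba", "abaaba"
print(periods(w), borders(u)[0], borders(v)[0], mu(w), overlaps_from_left(u, v, w))
```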

We continue with some more notations. Let $w$ and $u$ be nonempty words
where $w$ is also unbordered. We call $wu$ a Duval extension of $w$, if every factor
of $wu$ longer than $|w|$ is bordered, that is, $\mu(wu) = |w|$. A Duval extension $wu$
of $w$ is called trivial, if $d(wu) = \mu(wu) = |w|$. A nontrivial Duval extension $wu$
of $w$ is called minimal, if $u$ is of minimal length, that is, $u = u'a$ and $w = u'bw'$
where $a, b \in A$ and $a \neq b$.

Example 4. Let $w = abaabbabaababb$ and $u = aaba$. Then

w.u = abaabbabaababb.aaba

(for the sake of readability, we use a dot to mark where $w$ ends) is a nontrivial
Duval extension of $w$ of length $|wu| = 18$, where $\mu(wu) = |w| = 14$ and $d(wu) = 15$.
However, $wu$ is not a minimal Duval extension, whereas

w.u' = abaabbabaababb.aa

is minimal, with $u' = aa \le u$. Note, that $wu$ is not the longest nontrivial Duval
extension of $w$ since

w.v = abaabbabaababb.abaaba

is longer, with $v = abaaba$ and $|wv| = 20$ and $d(wv) = 17$. One can check that
$wv$ is a nontrivial Duval extension of $w$ of maximum length, and at the same
time $wv$ is also a minimal Duval extension of $w$.

Let an integer $p$ with $1 \le p < |w|$ be called a point in $w$. Intuitively, a point $p$
denotes the place between $w_{(p)}$ and $w_{(p+1)}$ in $w$. A nonempty word $u$ is called a
repetition word at point $p$ if $w = xy$ with $|x| = p$ and there exist $x'$ and $y'$ such
that $u \preccurlyeq x'x$ and $u \le yy'$. For a point $p$ in $w$, let

$$d(w, p) = \min\{\, |u| \;:\; u \text{ is a repetition word at } p \,\}$$

denote the local period at point $p$ in $w$. Note, the repetition word of length $d(w, p)$
at point $p$ is necessarily unbordered, and moreover, $d(w, p) \le d(w)$. A factorization
$w = uv$, with $u, v \neq \varepsilon$ and $|u| = p$, is called critical, if $d(w, p) = d(w)$, and,
if this holds, then $p$ is called a critical point.

Example 5. The word

w = ab.aa.b

has the period $d(w) = 3$ and two critical points, 2 and 4, marked by dots. The
shortest repetition words at the critical points are $aab$ and $baa$, respectively.
Note, that the shortest repetition words at the remaining points 1 and 3 are $ba$
and $a$, respectively.
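
Local periods and critical points can be computed directly from the definition. The sketch below rests on one observation, stated here as an assumption of the illustration: a repetition word of length k exists at point p exactly when the k letters before the point and the k letters after it agree wherever both lie inside the word. The function names are hypothetical.

```python
def period(word):
    """Smallest p >= 1 such that word[i] == word[i + p] wherever both exist."""
    n = len(word)
    return next(p for p in range(1, n + 1)
                if all(word[i] == word[i + p] for i in range(n - p)))


def local_period(word, p):
    """Length of the shortest repetition word at point p (1 <= p < len(word))."""
    n = len(word)
    k = 1
    while True:
        # Compare the letter at distance k before each position p + j with the
        # letter at p + j, but only when both positions are inside the word.
        if all(word[p - k + j] == word[p + j]
               for j in range(k) if p - k + j >= 0 and p + j < n):
            return k
        k += 1  # the condition is vacuously true for k = n, so the loop stops


def repetition_word(word, p):
    """The shortest repetition word at point p."""
    k, n = local_period(word, p), len(word)
    return "".join(word[p + j] if p + j < n else word[p - k + j] for j in range(k))


w = "abaab"
points = range(1, len(w))
critical = [p for p in points if local_period(w, p) == period(w)]
print(period(w), critical, [repetition_word(w, p) for p in points])
```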

3 Preliminary Results 

We state some auxiliary and well-known results about repetitions and borders
in this section which will be used to prove Theorem 2, in Section 4. The proofs
of these auxiliary results are straightforward and not given in this extended
abstract. Results taken from the literature are referenced.

Lemma 6. Let $zf = gzh$ where $f, g \neq \varepsilon$. Let $az'$ be the maximum unbordered
prefix of $az$. If $az$ does not occur in $zf$, then $agz'$ is unbordered.

Lemma 7. Let $w$ be an unbordered word and $u \le w$ and $v \preccurlyeq w$. Then $uw$ and
$wv$ are unbordered.

The following lemma was proven in [4].

Lemma 8. Let $w = uv$ be unbordered and $|u|$ be a critical point of $w$. Then $u$
and $v$ do not overlap.

The next result follows directly from Lemma 8.

Lemma 9. Let $u_0u_1$ be unbordered and $|u_0|$ be a critical point of $u_0u_1$. Then
for any word $x$, we have that $u_i x u_{i+1}$, where the indices are modulo 2, is either
unbordered or has a minimum border $g$ such that $|g| \ge |u_0| + |u_1|$.

The following Lemmas 10, 11 and 12 and Corollary 13 are given in [8]. Let
$a_0, a_1 \in A$, with $a_0 \neq a_1$, and $t_0 \in A^*$. Let the sequences $(a_i)$, $(s_i)$, $(s'_i)$, $(s''_i)$,
and $(t_i)$, for $i \ge 1$, be defined by

- $a_i = a_{i \pmod 2}$, that is, $a_i = a_0$ or $a_i = a_1$, if $i$ is even or odd, respectively,

- $s_i$ such that $a_i s_i$ is the shortest border of $a_i t_{i-1}$,

- $s'_i$ such that $a_{i+1} s'_i$ is the longest unbordered prefix of $a_{i+1} s_i$,

- $s''_i$ such that $s'_i s''_i = s_i$,

- $t_i$ such that $t_i s''_i = t_{i-1}$.

For any parameters of the above definition, the following holds.

Lemma 10. For any $a_0$, $a_1$, and $t_0$ there exists an $m \ge 1$ such that

$$|s_1| < |s_2| < \cdots < |s_m| = |t_{m-1}| \le \cdots \le |t_1| \le |t_0|$$

and $s_i \le s_{i+1}$, for all $1 \le i < m$, and $s_m = t_{m-1}$ and $|t_0| \le |s_m| + |s_{m-1}|$.

Lemma 11. Let $z \le t_0$ such that $a_0z$ and $a_1z$ do not occur in $t_0$. Let $a_0z_0$ and
$a_1z_1$ be the longest unbordered prefixes of $a_0z$ and $a_1z$, respectively. Let $m$ be the
smallest integer such that $s_m = t_{m-1}$. Then

1. if $m = 1$ then $a_1t_0$ is unbordered,

2. if $m > 1$ is odd, then $a_1s_m$ is unbordered and $|t_0| \le |s_m| + |z_0|$,

3. if $m > 1$ is even, then $a_0s_m$ is unbordered and $|t_0| \le |s_m| + |z_1|$.

Lemma 12. Let $v$ be an unbordered factor of $w$ of length $\mu(w)$. If $v$ occurs twice
in $w$, then $\mu(w) = d(w)$.

Corollary 13. Let $wu$ be a Duval extension of $w$. If $w$ occurs twice in $wu$, then
$wu$ is a trivial Duval extension.

4 Main Results

The first theorem in this section states a basic fact about minimal Duval
extensions which is also used in the proof of Theorem 2. We omit its proof
because of space limitations.

Theorem 14. Let $wu$ be a minimal Duval extension of $w$. Then $au$ occurs in $w$
where $a \preccurlyeq w$ and $a \in A$.

We come now to the main result of this paper that we already mentioned in
the introduction. It proves Duval's conjecture.

Theorem 2. If $wu$ is a nontrivial Duval extension of $w$, then $|u| < |w| - 1$.

Proof (sketch). Recall that every factor of $wu$ which is longer than $|w|$ is bordered
since $wu$ is a Duval extension of $w$. Let $z$ be the longest suffix of $w$ that occurs
twice in $zu$.

If $z = \varepsilon$ then $a \preccurlyeq w$ where $a \in A$ and $a$ does not occur in $u$. Let $w = u'bw''$
and $u = u'cu''$ such that $b, c \in A$ and $b \neq c$. Then $w = w_0au'cw_1$ by Theorem 14.
Consider the factor $au'cw_1u$ which has to be bordered with a shortest border $g$
such that $|au| < |g|$ and $g$ occurs in $w$. Hence, $|u| < |w| - 1$.

So, assume $z \neq \varepsilon$. We have $z \neq w$ since $wu$ is otherwise trivial by Corollary 13.
Let $a, b \in A$ be such that

$$w = w'az \quad \text{and} \quad u = u'bzr$$

and $z$ occurs in $zr$ only once, that is, $bz$ matches the rightmost occurrence of $z$
in $u$. Note, that $bz$ does not overlap $az$ from the right, by Lemma 7, and therefore
$u'$ exists, although it might be empty. Naturally, $a \neq b$ by the maximality of $z$,
and $w' \neq \varepsilon$, otherwise $azu'bz \le wu$ has either no border or $w$ is bordered (if
$azu'bz$ has a border not longer than $z$) or $az$ occurs in $zu$ (if $azu'bz$ has a border
longer than $z$); a contradiction in any case.

Let $az_0$ and $bz_1$ denote the longest unbordered prefix of $az$ and $bz$,
respectively. Let $a_0 = a$ and $a_1 = b$ and $t_0 = zr$ and the integer $m$ be defined as in
Lemma 11. We have then a word $s_m$, with its properties defined by Lemma 10
and 11, such that

$$t_0 = s_m t'.$$

Consider $azu'bz_0$. We have that $az$ and $azu'bz_0$ are both prefixes of $azu$, and $bz_0$
is a suffix of $azu'bz_0$ and $az$ does not occur in $zu'bz_0$. It follows from Lemma 6
that $azu'bz_0$ is unbordered, and hence,

$$|azu'bz_0| \le |w|. \qquad (1)$$



Case: $m$ is even. Then we have $m \ge 2$ and $as_m$ ($= a_ms_m$) is unbordered and
$|t_0| \le |s_m| + |z_1|$ by Lemma 11.

Suppose $|t_0| = |s_m| + |z_1|$ and $z_1 = z$. Then $|z| \le |s_{m-1}|$ by Lemma 10. Note,
that we have an immediate contradiction, if $m = 2$ since then $|t_0| = |t_1| + |s''_1|$
where $t_1 = s_2$ and $|s''_1| = |z| - |z_0|$, by the definition of $(s_i)$ and $(t_i)$, and hence,
$z_0 = \varepsilon$ and $z = a^k$, for some $k \ge 1$, and $c \le r$, for some $c \in A$ and $a \neq c$ otherwise
$az$ occurs in $u$, and finally, $azu'bzc$ is unbordered, and hence, $|u| < |w| - 1$. So,
assume $m > 2$. But now, $bz$ occurs in $t_0$ since $bs_{m-1}$ is a border of $bt_{m-2}$ and
$t_i \le t_0$, for all $0 \le i < m$, which is a contradiction.

So, assume that $|t_0| < |s_m| + |z_1|$ or $|z_1| < |z|$. Then $|t'| < |z|$.

Subcase: $|s_m| \le |z_0|$. Then $|azu'bz_0| \le |w|$ and

$$\begin{aligned}
|u| &= |azu| - |z| - 1 \\
    &= |azu'bz_0| - |z_0| + |t_0| - |z| - 1 \\
    &< |azu'bz_0| - |z_0| + |s_m| + |z_1| - |z| - 1 \\
    &\le |w| + |z_1| - |z| - 1 \\
    &\le |w| - 1
\end{aligned}$$

if $|t_0| < |s_m| + |z_1|$, or

$$\begin{aligned}
|u| &= |azu| - |z| - 1 \\
    &= |azu'bz_0| - |z_0| + |t_0| - |z| - 1 \\
    &\le |azu'bz_0| - |z_0| + |s_m| + |z_1| - |z| - 1 \\
    &\le |w| + |z_1| - |z| - 1 \\
    &< |w| - 1
\end{aligned}$$

if $|z_1| < |z|$. We have $|u| < |w| - 1$ in both cases.

Subcase: $|s_m| > |z_0|$. Then we have that $as_m$ is unbordered, and since $az_0$
is the longest unbordered prefix of $az$, we have that $az$ is a proper prefix of $as_m$,
and hence, $|z| < |s_m|$. Now, $azu'bs_m$ is unbordered otherwise its shortest border
is longer than $az$, since no prefix of $az$ is a suffix of $as_m$, and $az$ occurs in $u$;
a contradiction. So, $|azu'bs_m| \le |w|$ and $|u| < |w| - 1$ since $|t'| < |z|$.

Case: $m$ is odd. Then $bs_m$ ($= a_m s_m$) is unbordered and
$|t_0| \le |s_m| + |z_0|$; see Lemma 11. Note that $t_0 = s_m$ and
$t' = \varepsilon$ if $m = 1$, by Lemma 11. Surely $s_m \ne \varepsilon$. Note,
in particular,

$|t'| \le |z_0|$ .

If $|s_m| < |z|$, then $|u| < |w| - 1$ since

$|u| = |azu'bz_0| - |bz_0| + |bt_0| - |az|$

and $|azu'bz_0| \le |w|$, by (1), and $|t_0| \le |s_m| + |z_0|$.

Assume $|s_m| \ge |z|$. From $|bs_m| \ge 2$ it follows that there exists a
critical point $p$ in $bs_m$ such that $bs_m = v_0v_1$, where $|v_0| = p$, by
the critical factorization theorem; see [16].
[Figure omitted: diagram of $w$ and $u$ with the factors $az$, $u'$, $bzr$, $z_0$, $s$, $t'$, $v_0$ and $v_1$.]




From this follows

$bz \le v_0v_1$ . (2)

Note that if $s_m = z$ then $|u| < |w| - 1$, since we have
$|t_0| \le |s_m| + |s_{m-1}|$ and $|s_{m-1}| < |s_m|$, by Lemma 10, and

\begin{align*}
|u| &= |azu| - |z| - 1 \\
    &= |azu'bz| - |z| + |t_0| - |z| - 1 \\
    &\le |azu'bz| - |z| + |s_m| + |s_{m-1}| - |z| - 1 \\
    &\le |w| + |s_{m-1}| - |z| - 1 \\
    &< |w| - 1 .
\end{align*}

We have therefore in general

$|z_0| < |v_0v_1| - 1$ . (3)

Let

$u = u'_0 v_0 v_1 u_1$

be such that $v_0v_1$ does not occur in $u'_0$. Note that $v_0v_1$ does not
overlap with itself since it is unbordered, and $v_0$ and $v_1$ do not overlap
by Lemma 8. Consider the prefix $wu'_0bz$ of $wu$, which is bordered and has a
shortest border $h$ longer than $z$, and hence, $bz \preceq h$, otherwise $w$ is
bordered since $z \preceq w$. So, $bz$ occurs in $w$. Let

$w = w_0 bz w_1$

such that $bz$ occurs in $w_0bz$ only once, that is, we consider the leftmost
occurrence of $bz$ in $w$. Note that

$|w_0bz| \le |h| < |u'_0bz|$ (4)

where the first inequality comes from the definition of $w_0$ above and the
second inequality from the fact that $|u'_0bz| \le |h|$ implies that $w$ is
bordered. Let

$f = bzw_1u'_0v_0v_1$ .

If $f$ is unbordered, then $|f| \le |w|$, and hence, $|u'_0v_0v_1| \le |w_0|$.
Now, we have $|u'_0| < |w_0|$, which contradicts (4).

Assume $f$ is bordered, and let $h'$ be its shortest border.







[Figure omitted: diagram of $w$ and $u$ with the factors $az$, $u'$, $bzr$, $w_0$, $bz$, $w_1$, $u'_0$, $v_0$, $v_1$, $u_1$, $t'$ and the border $h'$.]




Surely, \hz\ < \h'\ otherwise vqV\ is bordered by (2). So, hz < h' . Moreover, 
I'yo^^il < otherwise hz occurs in Sm contradicting our assumption that bzr 
marks the rightmost occurence of bz in u. So, VqVi =4 h' , and VqVi occurs in w 
since woh' < w by (4). Let 



wobzv' = Woh' = w'qVqVi . 

Note, that uqUi does not occur in Wg otherwise it occurs in u'g contradicting our 
assumption. Moreover, we have h' = hzv' =4 u'gVgV\. Let u'gVgVi = u'gh' . Consider 

f' = wu'gbz 



which has a shortest border h" . 





[Figure omitted: diagram of $w$ and $u$ with the factors $az$, $u'_0$, $bz$, $bzr$, $v_0$, $v_1$, $t'$, $w_0$, $w_1$, $u_1$ and the border $h''$.]




Surely, hz ^ h" otherwise w is bordered with a suffix of z. Moreover, we have 
\wgbz\ < \h"\ < \u'gbz\ since hz does not occur in wg and w is unbordered. From 
that and wgh' = w'gVgVi and u'gh' = u'gVgVi follows now |wg| < |ug| and 

u'gVgVi = u'gbzv' and wg occurs in u'g. (5) 



Let now 



W = w'gVgViw'i ■ ■ ■ VgViw'2VgViw'iVgViW2 



and 



U = u'gVgViu'j ■ ■ ■ VoViu'2VoViu'iVgVit' 

such that uqUi does not occur in w'^, for all 0 < fc < i, or v'^, for all 0 < £ < j. 
Note, that this factorization of w and u is unique. 

We claim that $i = j$ and $w'_k = u'_k$, for all $1 \le k \le i$. However, we
omit the proof here for lack of space. The omitted part of the proof first
proceeds by induction on $k' = \min\{i,j\}$, showing that $w'_k = u'_k$, for
all $1 \le k \le k'$, and then considers the two cases $i < j$ and $j < i$,
deriving a contradiction in each of them.

So assume we have $i = j$ and



V = VqViw[ ■ ■ ■ VqViW^VqViw'-^ 
= VqViu[ ■ ■ ■ VqViU^VqViu'i 



and 



w = w'qVVqViW2 and u = u'^vv^vit' . 

Consider 

/o = ViW2u'qVVo ■ 

Subcase: /o is bordered. Then it has a shortest border = uigo^’o (where go 
is possibly empty). 



[Figure omitted: diagram of $w$ and $u$ with the factors $az$, $bzr$, $v_0$, $v_1$, $w_2$, $u'_0$, $v$, $t'$, $g_0$, the border $h_0$ and the word $f_0$.]




Recall, that W2 ^ s and either az =4 V1W2 or V1W2 =4 az. If |wiW2| < \az\ then v\ 
occurs in z, and hence, overlaps with vg since bz < VgVi; a contradiction. So, we 
have az =4 ViW2- Surely, |/io| < |uiW2| otherwise az occurs in u which contradicts 
our assumption. Let IV2 = go^ows- Note, that luoWal yf \az\ since az and vg begin 
with different letters. We have \az\ < It’o'ii'sl since otherwise vg occurs in z, and 
hence, overlaps with vi which is a contradiction. Consider now. 



/l = ■ 

If /i is unbordered, then |u| < licl — 1 . 



[Figure omitted: diagram of $w$ and $u$ with the factors $az$, $bzr$, $v_0$, $v_1$, $g_0$, $g_1$, $v$, $t'$, the border $h_1$ and the word $f_1$.]




If /i is bordered, then it has a shortest border hi = giVgVi with |az| < 
otherwise az occurs in u. Let vgWg = giVgViW4. But, now 

W = w'gVVgVigogiVgViW4 

which contradicts our assumption that w = WQVVgViW2 and vgvi does not occur 
in W2- 

Subcase: fg is unbordered. Then |/o| < |ru|, and hence, IicqI > |mq|. But, we 
also have |ruo| < |ug|; see ( 5 ). That implies liCgl = |mq|. Moreover, the factors 
$w_0$ and $bzv'$ both have nonoverlapping occurrences in $u'_0v_0v_1$ by (5). Therefore,
Wg = u'g. Now, 



w = xaw '2 and u = xht" 

where w'gVVgVi < x and a,b € A and a ^ b and w '2 =4 W 2 and t" =4 t' ■ We have 
that xh occurs in w by Theorem 14. Since xh is not a prefix of w and vgvi does 
not overlap with itself, we have \xb\ + |uofi| < |tu|- From \t'\ < \zq\ < |voni| — 1 
we get |m| < IicI — 1 and the claim follows. □ 

Note that the bound $|u| < |w| - 1$ on the length of a nontrivial Duval
extension $wu$ of $w$ is tight, as the example given in the introduction shows.
Theorem 2 also implies a new bound on the length of any word $w$ such that
$\partial(w) = \mu(w)$ must hold.

Corollary 3. If $|w| \ge 3\mu(w) - 2$ then $\partial(w) = \mu(w)$.

5 Conclusions 

In this paper we have given an affirmative answer to a long-standing conjecture
[8] by proving that a Duval extension $wu$ of $w$ longer than $2|w| - 2$ is
trivial. This bound is tight and also gives a new bound on the relation between
the length of an arbitrary word $w$ and the length $\mu(w)$ of its longest
unbordered factors, namely that $|w| \ge 3\mu(w) - 2$ implies
$\partial(w) = \mu(w)$, as conjectured by Assous and Pouzet in [1]. Assous and
Pouzet also gave the following example.

Example 15. Let $w = a^n b a^{n+1} b a^n b a^{n+2} b a^n b a^{n+1} b a^n$

with $|w| = 7n + 10$ and $\mu(w) = 3n + 6$ and $\partial(w) = 4n + 7$.
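
The values in Example 15 can be checked by brute force for small $n$. The following Python sketch is only an illustration: the helper names (`is_unbordered`, `minimal_period`, `longest_unbordered_factor`, `example_15`) are hypothetical and not taken from the paper, and the word is built under the assumption that the middle block is $a^{n+2}$, which is what the stated length $7n + 10$ requires.

```python
def is_unbordered(s: str) -> bool:
    """True if no proper nonempty prefix of s is also a suffix of s."""
    return all(s[:k] != s[-k:] for k in range(1, len(s)))


def minimal_period(s: str) -> int:
    """Smallest p >= 1 such that s[i] == s[i + p] for all valid i."""
    return next(p for p in range(1, len(s) + 1)
                if all(s[i] == s[i + p] for i in range(len(s) - p)))


def longest_unbordered_factor(s: str) -> int:
    """Length of a longest unbordered factor of s, by exhaustive search."""
    return max(j - i
               for i in range(len(s))
               for j in range(i + 1, len(s) + 1)
               if is_unbordered(s[i:j]))


def example_15(n: int) -> str:
    """The word of Example 15 (assumed block exponents n, n+1, n, n+2, n, n+1, n)."""
    a, b = "a", "b"
    return (a * n + b + a * (n + 1) + b + a * n + b + a * (n + 2) + b
            + a * n + b + a * (n + 1) + b + a * n)


w = example_15(1)
# For n = 1: length 17 = 7n + 10, longest unbordered factor 9 = 3n + 6,
# minimal period 11 = 4n + 7.
print(len(w), longest_unbordered_factor(w), minimal_period(w))
```

For $n = 1$ the word is `abaababaaababaaba`; its longest unbordered factor is shorter than its minimal period even though $|w| = 17 < 3\mu(w) - 2 = 25$, which is consistent with the bound of Corollary 3.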

We have that the precise bound for the length of a word that implies
$\partial(w) = \mu(w)$ is larger than $\frac{7}{3}\mu(w) - 4$ and smaller than
$3\mu(w) - 1$. The characterization of the precise bound of the length of a word
as a function of its longest unbordered factor is still an open problem.

Finally, we would like to mention that after our proof was first made public
in [12], an alternative proof of Conjecture 1 [13] and finally also of Theorem 2
[14] was proposed by Stepan Holub. Those proofs use a different technique
relying on lexicographic orders and are shorter than the original one presented
here.

We think that our proof provides a more detailed insight into the structure
of a nontrivial Duval extension by examining those words closely, and might
therefore be useful for answering further questions on this subject.



Acknowledgements. We would like to thank the anonymous referees for their 
Abstract. We give a positive solution to the so-called finite substitution
problem which was open for more than 10 years [11]: given recognizable
languages K and L, decide whether there exists a finite substitution
$\sigma$ such that $\sigma(K) = L$. For this, we introduce a new model of weighted
automata and show the decidability of its limitedness problem by solving 
the underlying Burnside problem. 



1 Introduction 

The present paper is the first one in a series of papers in which we will introduce 
new models of weighted automata to solve important decision problems in the 
theory of recognizable languages. The problems we address include the so-called 
limitedness problem, solved independently by K. Hashiguchi, H. Leung, and 
I. Simon [1,7,13], Burnside type problems in semigroup theory, and the so-called 
finite substitution problem, which was open for more than 10 years [11]: given 
recognizable languages K and L, decide whether there exists a finite substitution 
$\sigma$ such that $\sigma(K) = L$. Our main result is a solution to this problem.

Our tools rely on the approach by H. Leung and I. Simon involving distance 
automata [7,8,12,13]. However, this concept proved to be insufficient for our 
purpose and led us to introduce a different class of automata, the desert automata 
which are non-deterministic finite automata with a set of marked transitions. The 
weight of a path is defined as the length of a longest subpath which does not 
contain a marked transition. The weight of a word is the minimum of the weights 
of all successful paths of the word. The second main result of the paper states 
that it is decidable whether the range of the mapping of a desert automaton is 
finite which is a counterpart of the corresponding result for distance automata 
[1,7,13]. Finally, we obtain some partial results on the complexity. 
2 Overview

2.1 Preliminaries 

For sets $M$, we denote by $\mathcal{P}(M)$ the power set of $M$, and we denote
by $\mathcal{P}_f(M)$ the set of all finite subsets of $M$.

Within the entire paper, we fix some $n \ge 1$ which is used as the dimension
of matrices. Whenever we do not explicitly state the range of a variable, we
assume that it ranges over the set $\{1, \dots, n\}$. For example, a phrase like
“for every $i,j$” is understood as “for every $i,j \in \{1, \dots, n\}$”.

2.2 Distance Automata 

K. Hashiguchi introduced the notion of a distance automaton motivated by his
research on the star height hierarchy in 1982 [1]. A distance automaton is a
tuple $A = [Q, E, I, F, \Delta]$, where $[Q, E, I, F]$ is a non-deterministic
finite automaton and $\Delta : E \to \{0, 1\}$ is a mapping called distance
function. Let $A = [Q, E, I, F, \Delta]$ be a distance automaton. The distance
function is extended to paths: the distance of a path $\pi$ is defined as the
sum of the distances of all transitions in $\pi$ and denoted by $\Delta(\pi)$.
The distance of a word $w \in \Sigma^*$ is denoted by $\Delta(w)$ and defined as
the minimum over the distances of all successful paths labeled with $w$.

A distance automaton is limited if there is a $d \in \mathbb{N}$ such that
$\Delta(w) \le d$ for every $w \in L(A)$. The problem whether a distance
automaton is limited is decidable [1,3,7,8,13] and PSPACE-complete [9].
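
As an illustration of the definition, the distance of a single word can be computed by a min-plus dynamic program over the states. This is a minimal Python sketch, not the paper's construction: the dictionary encoding of transitions and the two-state example automaton are assumptions made for the example.

```python
from math import inf


def distance(word, transitions, initial, final):
    """Minimum total cost over all successful paths labelled `word`.

    transitions: dict mapping (state, letter) -> list of (next_state, cost),
    with cost in {0, 1} (assumed encoding). Returns inf if `word` has no
    successful path.
    """
    best = {q: 0 for q in initial}          # cheapest cost to reach each state
    for letter in word:
        nxt = {}
        for q, d in best.items():
            for r, cost in transitions.get((q, letter), []):
                if d + cost < nxt.get(r, inf):
                    nxt[r] = d + cost
        best = nxt
    return min((d for q, d in best.items() if q in final), default=inf)


# Hypothetical automaton: state 0 charges every a, state 1 charges every b.
T = {
    (0, "a"): [(0, 1)], (0, "b"): [(0, 0)],
    (1, "a"): [(1, 0)], (1, "b"): [(1, 1)],
}
print(distance("aab", T, {0, 1}, {0, 1}))      # min(#a, #b) = 1
print(distance("aaabbb", T, {0, 1}, {0, 1}))   # 3, so this automaton is not limited
```

The distance of this example grows with $k$ on the words $a^kb^k$, so it is not limited; deciding limitedness in general requires the techniques cited above, not word-by-word evaluation.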

2.3 Desert Automata 

A desert automaton is a tuple $A = [Q, E, I, F, E^{Y}]$ where $[Q, E, I, F]$ is
a non-deterministic finite automaton and the elements of $E^{Y} \subseteq E$ are
called water transitions. Let $A = [Q, E, I, F, E^{Y}]$ be a desert automaton.
Its language $L(A)$ is defined as the language of $[Q, E, I, F]$. We call $A$
unambiguous (resp. deterministic) if $[Q, E, I, F]$ is unambiguous (resp.
deterministic).

Let $\pi$ be a path in $A$. We call $\pi'$ a subpath of $\pi$ if there are paths
$\pi_1, \pi_2$ in $A$ satisfying $\pi = \pi_1\pi'\pi_2$. We denote by
$\Delta(\pi)$ the length of a longest subpath of $\pi$ which does not contain
any water transition. The intuition behind this definition is that we imagine
$\pi$ as a path through a desert. We intend to walk along $\pi$. We carry a
water tank, but this tank does not last the entire path. Whenever we come along
a water transition, we can fill up the tank, and the tank has to last until we
meet the next water transition. We can understand $\Delta(\pi)$ as the required
capacity of the tank to walk along the path $\pi$.

For every $w \in \Sigma^*$, let
$\Delta(w) = \min\{\Delta(\pi) \mid p \in I,\ q \in F,\ \pi \in p \xrightarrow{w} q\}$,
where $p \xrightarrow{w} q$ denotes the set of all paths from $p$ to $q$ with
the label $w$. A desert automaton is limited if there is a $d \in \mathbb{N}$
such that $\Delta(w) \le d$ for every $w \in L(A)$.

Example 1. Consider the desert automaton $A_1$ with $Q_1 = I_1 = F_1 = \{q_1\}$,
$E_1 = \{(q_1, a, q_1), (q_1, b, q_1)\}$, and $E_1^{Y} = \{(q_1, b, q_1)\}$. For
every $w \in \Sigma^*$, $\Delta_1(w)$ is the largest $k$ such that
$w \in \Sigma^* a^k \Sigma^*$. Similarly, there is a desert automaton $A_2$
such that for every $w \in \Sigma^*$, $\Delta_2(w)$ is the largest $k$ such that
$w \in \Sigma^* b^k \Sigma^*$.

The disjoint union of $A_1$ and $A_2$ yields a desert automaton $A$ such that
for every $w \in \Sigma^*$, $\Delta(w) = \min(\Delta_1(w), \Delta_2(w))$. For
every $k \in \mathbb{N}$, we have $\Delta(a^kb^k) = k$, and thus, $A$ is not
limited. However, for every $u, v, w \in \Sigma^*$, $k \in \mathbb{N}$, we have
$\Delta(uv^kw) \le |uvw|$, i.e., the sequence $(\Delta(uv^kw))_{k \ge 1}$ is
bounded.
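
The automaton of Example 1 is small enough to simulate directly. The following Python sketch evaluates $\Delta(w)$ for a given word by tracking, for every reachable state, the current water-free run and the longest run seen so far. The transition encoding and the function name `desert_weight` are assumptions for illustration only.

```python
from math import inf


def desert_weight(word, transitions, initial, final):
    """Delta(word): minimum over successful paths of the longest water-free run.

    transitions: dict mapping (state, letter) -> list of (next_state, is_water)
    (assumed encoding). Returns inf if `word` has no successful path.
    """
    # Key: (state, length of the current dry run). Value: the smallest
    # "longest dry run so far" among paths reaching that configuration.
    configs = {(q, 0): 0 for q in initial}
    for letter in word:
        nxt = {}
        for (q, run), worst in configs.items():
            for r, is_water in transitions.get((q, letter), []):
                new_run = 0 if is_water else run + 1
                new_worst = max(worst, new_run)
                if new_worst < nxt.get((r, new_run), inf):
                    nxt[(r, new_run)] = new_worst
        configs = nxt
    return min((w for (q, _), w in configs.items() if q in final), default=inf)


# Disjoint union of A_1 (b is water) and A_2 (a is water) from Example 1.
EXAMPLE_1 = {
    (1, "a"): [(1, False)], (1, "b"): [(1, True)],
    (2, "a"): [(2, True)],  (2, "b"): [(2, False)],
}
for k in range(1, 5):
    print(k, desert_weight("a" * k + "b" * k, EXAMPLE_1, {1, 2}, {1, 2}))
```

The printed weights equal $k$, matching $\Delta(a^kb^k) = k$, while pumping a fixed factor, for example in $(ab)^k$, keeps the weight bounded.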

By Example 1, a pumping condition does not suffice to guarantee limitedness 
of desert automata. On the other hand, a pumping condition is sufficient to 
guarantee limitedness of unambiguous desert automata, and thus, the mapping 
A in Example 1 cannot be computed by an unambiguous desert automaton. In 
[4], we show by a bounded variation argument, that there are mappings which 
can be computed by unambiguous but not by deterministic desert automata. 
Consequently, the classes of mappings which are computable by deterministic, 
unambiguous, resp. arbitrary desert automata form a strict hierarchy. 

Another difficulty in the research on desert automata is that, in contrast to
distance automata, the distance of a path $\pi = \pi_1\pi_2$ cannot be
calculated from $\Delta(\pi_1)$ and $\Delta(\pi_2)$. We just have
$\max(\Delta(\pi_1), \Delta(\pi_2)) \le \Delta(\pi) \le \Delta(\pi_1) + \Delta(\pi_2)$.
Thus, we cannot use the tropical semiring [13] and we develop the notion of word
matrices and related results in Section 3. As a main result we show:
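
A small Python sketch makes the failure of compositionality concrete. Paths are encoded here as strings over `Y` (water) and `d` (desert); this encoding and the helper names are assumptions for illustration.

```python
from itertools import product


def delta(path: str) -> int:
    """Length of the longest block of consecutive 'd' (dry) transitions."""
    longest = run = 0
    for t in path:
        run = 0 if t == "Y" else run + 1
        longest = max(longest, run)
    return longest


def bounds_hold(max_len: int) -> bool:
    """Check max(D(p1), D(p2)) <= D(p1 p2) <= D(p1) + D(p2) exhaustively."""
    paths = ["".join(p) for n in range(max_len + 1) for p in product("Yd", repeat=n)]
    return all(
        max(delta(p1), delta(p2)) <= delta(p1 + p2) <= delta(p1) + delta(p2)
        for p1 in paths for p2 in paths
    )


# Same weights of the parts, different weight of the concatenation:
print(delta("dY"), delta("Yd"), delta("dY" + "Yd"))   # 1 1 1
print(delta("Yd"), delta("dY"), delta("Yd" + "dY"))   # 1 1 2
print(bounds_hold(4))                                  # True
```

Both pairs of paths have parts of weight 1, yet the concatenations have weights 1 and 2, so no function of $\Delta(\pi_1)$ and $\Delta(\pi_2)$ alone can give $\Delta(\pi_1\pi_2)$.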

Theorem 1. It is decidable and PSPACE-hard whether a desert automaton is
limited.

Here, we just prove the decidability. The proof of PSPACE-hardness is an easy 
adaptation of H. Leung’s proof of the same result for distance automata [4,7]. 

3 An Algebraic Framework for the Limitedness Problem 

We develop an algebraic framework to show the decidability of the limitedness
problem of desert automata. For the rest of the paper, let
$A = [Q, E, I, F, E^{Y}]$ be a desert automaton, let $n = |Q|$, and assume
$Q = \{1, \dots, n\}$.

3.1 Finite Semigroup Theory 

We assume that the reader is familiar with basic notions on semigroups [10]. 
Let $S$ be a finite semigroup. The sets of idempotent (resp. regular) elements
of $S$ are denoted by $E(S)$ (resp. $Reg(S)$). For every $m \ge 1$, we call
$a_1, \dots, a_m \in S$ a smooth product if
$a_1 =_{\mathcal{J}} \dots =_{\mathcal{J}} a_m =_{\mathcal{J}} (a_1 \cdots a_m) \in Reg(S)$.

We call a mapping $\sharp : E(S) \to E(S)$ consistent if we have for every
$e, f \in E(S)$, $a, b \in S$ satisfying $e =_{\mathcal{J}} f$ and $f = aeb$,
$f^\sharp = ae^\sharp b$. It is shown in [4] that a mapping is consistent iff we
have for every $a, b \in S$ with $ab, ba \in E(S)$,
$(ab)^\sharp = a(ba)^\sharp b$. It was already observed in [7] that every
consistent mapping $\sharp$ admits an extension to
$\sharp : Reg(S) \to Reg(S)$ by setting for every $e \in E(S)$ and
$c, d \in S$ satisfying $e =_{\mathcal{J}} ced$, $(ced)^\sharp = ce^\sharp d$.

Let $a \in Reg(S)$. There are $e, f \in E(S)$ with
$e =_{\mathcal{R}} a =_{\mathcal{L}} f$, i.e., $ea = a = af$. Thus,
$e^\sharp a = a^\sharp = af^\sharp$, and moreover
$a^\sharp \le_{\mathcal{L}} a$ and $a^\sharp \le_{\mathcal{R}} a$.

If $a, b \in S$ are a smooth product, then
$(ab)^\sharp = a^\sharp b^\sharp = a^\sharp b = ab^\sharp$ [4].
3.2 The Semigroup of Word Matrices

As observed above, we cannot describe desert automata by matrices over the 
tropical semiring. Thus, we develop the notion of word matrices. 

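Consider the alphabet $D = \{Y, /X\backslash\}$. The symbol Y (resp. /X\) should
be pronounced “water” (resp. “desert”). The words over $D$ represent paths in
desert automata. Consider the semiring
$\mathbb{D} = (\mathcal{P}_f(D^+) \cup \{\omega\}, \cup, \cdot)$, where $\omega$
is a new element and $\cup$ and $\cdot$ are extended from $\mathcal{P}_f(D^+)$
by setting for every $X \in \mathbb{D} \setminus \{\emptyset\}$,
$\omega \cdot X = X \cdot \omega = \omega$,
$\omega \cup X = X \cup \omega = X$, and further,
$\emptyset \cdot \omega = \omega \cdot \emptyset = \emptyset$,
$\emptyset \cup \omega = \omega \cup \emptyset = \omega$. The natural ordering
on $\mathbb{D}$ is set inclusion extended from $\mathcal{P}_f(D^+)$ in a way
that $\omega$ is between the empty set and the singletons.

We call a matrix $A$ with entries in $\mathbb{D}$ a word matrix if there is some
$k$ such that every word in $A$ has the length $k$, i.e., for every $i, j$ and
every $\pi \in A[i,j]$ we have $|\pi| = k$. We denote by
$\mathbb{D}_{n \times n}$ the semigroup of all $n \times n$ word matrices. Note
that $\mathbb{D}_{n \times n}$ is not closed under $\cup$.

Later, we will use the free semigroup $\mathbb{D}^+_{n \times n}$ over
$\mathbb{D}_{n \times n}$. We denote by $\alpha$ the natural homomorphism from
$\mathbb{D}^+_{n \times n}$ onto $\mathbb{D}_{n \times n}$.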

3.3 On the Semantics of Desert Automata 

We give another method to define the semantics of desert automata using word
matrices. We define a homomorphism $\theta : E^+ \to D^+$ by setting for every
transition $e \in E$, $\theta(e) = Y$ (resp. /X\) if $e$ is a water transition
(resp. $e$ is not a water transition). We can assign every word
$w \in \Sigma^+$ a matrix $\theta(w) \in \mathbb{D}_{n \times n}$ by setting
$\theta(w)[i,j] = \theta(i \xrightarrow{w} j)$. Clearly,
$\theta : \Sigma^+ \to \mathbb{D}_{n \times n}$ is a homomorphism.

For two paths $\pi, \pi'$ with $\theta(\pi) = \theta(\pi')$, we have
$\Delta(\pi) = \Delta(\pi')$. Hence, we can define $\Delta$ on $D^+$ by
$\Delta(\pi) = \max\{\, l \in \mathbb{N} \mid \pi \in D^* \, /X\backslash^{\,l} \, D^* \,\}$
and we have for every path $\pi$, $\Delta(\pi) = \Delta(\theta(\pi))$. We
extend $\Delta$ from $D^+$ to $\mathbb{D}$ by setting
$\Delta(X) = \min\{\Delta(\pi) \mid \pi \in X\}$ for $X \ne \omega$ and
$\Delta(\omega) = \omega$.

We have another definition of the semantics of desert automata by setting
for every $w \in \Sigma^+$,
$\Delta(w) = \min\{\, \Delta(\theta(w)[i,j]) \mid i \in I,\ j \in F \,\}$. This
definition is equivalent to the definition in Section 2.3 up to the empty word.

3.4 The Small Desert Semiring 

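Let $\mathcal{D} = \{Y, /X\backslash, \omega, \infty\}$. Intuitively, /X\
represents a path without water, Y represents a path with water, and $\infty$
means that there is no path. We define on $\mathcal{D}$ an operation $\cdot$ as
the maximum over the ordering
$/X\backslash \sqsubseteq Y \sqsubseteq \omega \sqsubseteq \infty$. This
operation corresponds to the concatenation of paths. Clearly, $\cdot$ is
idempotent, and $(\mathcal{D}, \cdot)$ is a monoid with identity /X\ and zero
$\infty$.

We define an operation $\min$ on $\mathcal{D}$ over the ordering
$Y \le /X\backslash \le \omega \le \infty$. Clearly,
$(\mathcal{D}, \min, \cdot)$ is a semiring. We denote by
$\mathcal{D}_{n \times n}$ the semiring of all $n \times n$ matrices over
$\mathcal{D}$. Consider the homomorphism
$\Psi : (\mathbb{D}, \cup, \cdot) \to (\mathcal{D}, \min, \cdot)$ defined by

$\Psi(X) = Y$ if $X \cap \{Y, /X\backslash\}^* \, Y \, \{Y, /X\backslash\}^* \ne \emptyset$,
$\Psi(X) = /X\backslash$ if $X$ is a nonempty subset of $/X\backslash^+$,
$\Psi(X) = \omega$ if $X = \omega$,
$\Psi(X) = \infty$ if $X = \emptyset$.

It extends to a homomorphism
$\Psi : (\mathbb{D}_{n \times n}, \cdot) \to (\mathcal{D}_{n \times n}, \cdot)$.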
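
The two orderings and the mapping $\Psi$ are easy to mix up, so a small executable model may help. The Python sketch below is an illustration under assumptions: the four elements are encoded as the strings `dry`, `water`, `omega`, `inf`, words over the alphabet are strings over `Y` and `d`, and all function names are hypothetical.

```python
DRY, WATER, OMEGA, INF = "dry", "water", "omega", "inf"
CONCAT_ORDER = [DRY, WATER, OMEGA, INF]   # ordering whose maximum is the product
MIN_ORDER = [WATER, DRY, OMEGA, INF]      # ordering whose minimum is the sum


def mul(x, y):
    """Product of the small desert semiring: concatenation of paths."""
    return max(x, y, key=CONCAT_ORDER.index)


def add(x, y):
    """Sum of the small desert semiring: choice between paths."""
    return min(x, y, key=MIN_ORDER.index)


def is_empty(X):
    return X != OMEGA and len(X) == 0


def big_union(X, Y):
    """Union in the big semiring: finite word sets plus the element omega."""
    if X == OMEGA:
        return OMEGA if is_empty(Y) else Y
    if Y == OMEGA:
        return OMEGA if is_empty(X) else X
    return X | Y


def big_concat(X, Y):
    """Product in the big semiring."""
    if is_empty(X) or is_empty(Y):
        return frozenset()
    if X == OMEGA or Y == OMEGA:
        return OMEGA
    return frozenset(x + y for x in X for y in Y)


def psi(X):
    """The mapping Psi from the big semiring to the small desert semiring."""
    if X == OMEGA:
        return OMEGA
    if not X:
        return INF
    return WATER if any("Y" in word for word in X) else DRY


SAMPLES = [frozenset(), frozenset({"d"}), frozenset({"Y"}),
           frozenset({"dd", "dY"}), OMEGA]
print(all(psi(big_union(X, Y)) == add(psi(X), psi(Y)) and
          psi(big_concat(X, Y)) == mul(psi(X), psi(Y))
          for X in SAMPLES for Y in SAMPLES))
```

On these samples $\Psi$ commutes with both operations, as the homomorphism property requires; the check is a sanity test on a few elements, not a proof.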
3.5 Strange Limits

We define the notion of a $\Psi$-limit. It is not a classic limit notion,
because it is not based on a metric and the limit of a sequence does not belong
to the same algebraic structure as the members of the sequence. A $\Psi$-limit
of some sequence over $\mathbb{D}$ describes in terms of $\mathcal{D}$ how the
sequence is bounded. We will extend this notion in a natural way to word
matrices, and we prove some results which enable us to use $\Psi$-limits in the
same way as traditional limit concepts.

Recall that some sequence $(q_k)_{k \ge 1}$ is a subsequence of
$(p_k)_{k \ge 1}$ if there is a strictly increasing mapping
$f : \mathbb{N} \to \mathbb{N}$ such that $q_k = p_{f(k)}$ for every $k \ge 1$.

A sequence $(x_k)_{k \ge 1}$ over $\mathbb{N} \cup \{\infty\}$ is said to be
bounded if there are $l, K \ge 1$ such that $x_k \le K$ for every $k \ge l$. It
tends to infinity if for every $K \ge 1$ there is some $l \ge 1$ such that for
every $k \ge l$ we have $x_k \ge K$.

Let $(X_k)_{k \ge 1}$ be a sequence over $\mathbb{D}$. We define the
$\Psi$-limit $\Psi(X_k)_{k \ge 1}$ of $(X_k)_{k \ge 1}$.

L1. If there is an $l \ge 1$ such that $X_k = \emptyset$ for every $k \ge l$,
then $\Psi(X_k)_{k \ge 1} = \infty$. In this case, we call $(X_k)_{k \ge 1}$ an
$\infty$-sequence.

For every $z \in \{Y, /X\backslash\}$, $X \in \mathbb{D}$, let
$\Delta(X, z) = \min\{\, \Delta(\pi) \mid \pi \in X,\ \Psi(\pi) = z \,\}$,
where $\min \emptyset = \infty$. We denote the sequence
$(\Delta(X_k, z))_{k \ge 1}$ by $\Delta(X_k, z)_{k \ge 1}$. Assume that there is
an $l \ge 1$ such that $X_k \ne \emptyset$ for every $k \ge l$. We define

L2. If $\Delta(X_k, Y)_{k \ge 1}$ is bounded, then we define
$\Psi(X_k)_{k \ge 1} = Y$.

L3. If $\Delta(X_k, Y)_{k \ge 1}$ tends to infinity and
$\Delta(X_k, /X\backslash)_{k \ge 1}$ is bounded, then we define
$\Psi(X_k)_{k \ge 1} = /X\backslash$.

L4. If $\Delta(X_k, Y)_{k \ge 1}$ and $\Delta(X_k, /X\backslash)_{k \ge 1}$
tend to infinity, then $\Psi(X_k)_{k \ge 1} = \omega$.

If we can apply one of these four definitions to a sequence $(X_k)_{k \ge 1}$,
then we call $(X_k)_{k \ge 1}$ a convergent sequence. Otherwise,
$\Psi(X_k)_{k \ge 1}$ is not defined. We denote the set of all convergent
sequences by $\mathcal{L}(\mathbb{D})$. Every sequence contains a convergent
subsequence. Every constant sequence $(X_k)_{k \ge 1}$ is convergent and we have
$\Psi(X_k)_{k \ge 1} = \Psi(X_1)$. For sequences over $\mathbb{D}$, we define
$\cup$ and $\cdot$ componentwise.

Lemma 1. [4] 1. Every subsequence of a convergent sequence is convergent and
converges to the same $\Psi$-limit.

2. The set of convergent sequences is closed under componentwise $\cup$ and
$\cdot$, and
$\Psi : (\mathcal{L}(\mathbb{D}), \cup, \cdot) \to (\mathcal{D}, \min, \cdot)$
is a homomorphism.

The notion of a $\Psi$-limit and a convergent sequence extends naturally to
matrices. By Lemma 1(2),
$\Psi : \mathcal{L}(\mathbb{D}_{n \times n}) \to \mathcal{D}_{n \times n}$ is a
homomorphism.

For every subset T G D„xn we denote by F{T) the set of all !F-limits of 
all convergent sequences over (T). We have F{{T)) C F{T). We formulate the 
limitedness problem of desert automata by using the notions of a iF-limit. 

Proposition 1. Let A = [Q, E, I, F, E'^] he a desert automaton and T = 9{E). 
The following assertions are equivalent: 

1. A is not limited. 

2. There is a matrix a G F{T) such that min| a[i,j] | z G /, j G F } = u). 
Proof (sketch). (2)$\Rightarrow$(1) There is a sequence $(w_k)_{k \ge 1}$ such
that $\Psi(\theta(w_k))_{k \ge 1} = a$. For every $K \in \mathbb{N}$, there is a
large $k \in \mathbb{N}$ with $w_k \in L(A)$ and $\Delta(w_k) > K$.

(1)$\Rightarrow$(2) Let $(w_k)_{k \ge 1}$ be a sequence in $L(A)$ such that
$\Delta(w_1) < \Delta(w_2) < \dots$ Let $(B_k)_{k \ge 1}$ be some arbitrary
convergent subsequence of $(\theta(w_k))_{k \ge 1}$ and
$a = \Psi(B_k)_{k \ge 1}$.

Clearly, Prop. 1 is rather another formulation of the limitedness problem of desert 
automata than a solution. To give an algorithm for the limitedness problem of 
desert automata, we show some method to compute tf'(T) by avoiding to examine 
the possibly uncountable set £(T). 

4 The Solution of the Burnside Problem 

In this section, we solve the Burnside problem for word matrices. Our main 
strategy follows H. Leung’s approach [8] to a similar problem for the tropical 
semiring. However, there are great differences in the proof details, because we 
consider a more involved semiring and another notion of stabilization. 



4.1 Stabilization 

We define a mapping
$\sharp : E(\mathcal{D}_{n \times n}) \to \mathcal{D}_{n \times n}$ which we
call stabilization. For every $e \in E(\mathcal{D}_{n \times n})$ and $i, j$ let

$e^\sharp[i,j] = \infty$ if $e[i,j] = \infty$,
$e^\sharp[i,j] = Y$ if there is some $l$ such that $e[i,l] = e[l,l] = e[l,j] = Y$,
$e^\sharp[i,j] = \omega$ otherwise.

Remark 1. Let $i, l$ be such that $e[i,l] = /X\backslash$ and $e[l,l] = Y$.
Then, $e^2[i,l] = Y \ne e[i,l]$. Thus, such $i$ and $l$ cannot exist, and
similarly, it is impossible that for some $l, j$, $e[l,l] = Y$ and
$e[l,j] = /X\backslash$.

If for some $i, j$, $e^\sharp[i,j] = Y$, then $e[i,j] = e^3[i,j] = Y$.
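
The definition of stabilization can be read off directly as code. The following Python sketch assumes the same hypothetical string encoding of the four semiring elements as before (`dry`, `water`, `omega`, `inf`) and represents matrices as tuples of tuples; the example matrix is made up for illustration.

```python
DRY, WATER, OMEGA, INF = "dry", "water", "omega", "inf"
CONCAT_ORDER = [DRY, WATER, OMEGA, INF]   # product = maximum over this ordering
MIN_ORDER = [WATER, DRY, OMEGA, INF]      # sum = minimum over this ordering


def mat_mul(a, b):
    """Matrix product over the small desert semiring."""
    n = len(a)
    return tuple(
        tuple(
            min((max(a[i][k], b[k][j], key=CONCAT_ORDER.index) for k in range(n)),
                key=MIN_ORDER.index)
            for j in range(n))
        for i in range(n))


def stabilize(e):
    """Stabilization of an idempotent matrix, following the three cases above."""
    assert mat_mul(e, e) == e, "stabilization is only defined for idempotents"
    n = len(e)

    def entry(i, j):
        if e[i][j] == INF:
            return INF
        if any(e[i][l] == e[l][l] == e[l][j] == WATER for l in range(n)):
            return WATER
        return OMEGA

    return tuple(tuple(entry(i, j) for j in range(n)) for i in range(n))


e = ((WATER, OMEGA),
     (INF, DRY))
print(stabilize(e))   # (('water', 'omega'), ('inf', 'omega'))
```

The entry `dry` on the diagonal turns into `omega`: iterating a loop that never passes water yields arbitrarily long water-free paths, which is the unboundedness that stabilization is meant to capture.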

Lemma 2. Let $e \in E(\mathcal{D}_{n \times n})$. If $e[i,j] \ne /X\backslash$
for every $i, j$, then $e = e^\sharp$.

Proof. Let $i, j$ be arbitrary. If $e[i,j] \in \{\omega, \infty\}$, then
$e[i,j] = e^\sharp[i,j]$.

Assume $e[i,j] = Y$. By $e^{n+2} = e$, there are
$i = i_0, \dots, i_{n+2} = j$ such that for $l \in \{1, \dots, n+2\}$ we have
$e[i_{l-1}, i_l] \in \{Y, /X\backslash\}$, i.e., $e[i_{l-1}, i_l] = Y$. There
are $p < q \in \{1, \dots, n+1\}$ satisfying $i_p = i_q$. Then, we have
$e[i, i_p] = e[i_p, i_p] = e[i_p, j] = Y$ and $e^\sharp[i,j] = Y$.

We state the main result of Section 4. For subsets
$M \subseteq \mathcal{D}_{n \times n}$ we define $\langle M \rangle^\sharp$ as
the least subset of $\mathcal{D}_{n \times n}$ which contains $M$ and is closed
both under matrix multiplication and stabilization $\sharp$ of idempotent
matrices.

Theorem 2. Let T C D„xn be finite. We have L'{T) = {'P(T))K 
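
Read together with Proposition 1, Theorem 2 suggests a decision procedure: compute the closure of the generator matrices under multiplication and stabilization of idempotents, then look for a matrix whose minimum over initial and final states is $\omega$. The Python sketch below follows that reading. It is an illustration under assumptions, not the paper's algorithm as stated: the string encoding, the function names and the two test automata are hypothetical, and states are numbered from 0.

```python
DRY, WATER, OMEGA, INF = "dry", "water", "omega", "inf"
CONCAT_ORDER = [DRY, WATER, OMEGA, INF]
MIN_ORDER = [WATER, DRY, OMEGA, INF]


def mat_mul(a, b):
    n = len(a)
    return tuple(tuple(
        min((max(a[i][k], b[k][j], key=CONCAT_ORDER.index) for k in range(n)),
            key=MIN_ORDER.index)
        for j in range(n)) for i in range(n))


def stabilize(e):
    n = len(e)

    def entry(i, j):
        if e[i][j] == INF:
            return INF
        if any(e[i][l] == e[l][l] == e[l][j] == WATER for l in range(n)):
            return WATER
        return OMEGA

    return tuple(tuple(entry(i, j) for j in range(n)) for i in range(n))


def closure(generators):
    """Least set containing the generators, closed under products and
    stabilization of idempotents (finite, since there are 4^(n*n) matrices)."""
    seen = set(generators)
    todo = list(seen)
    while todo:
        a = todo.pop()
        candidates = []
        for b in list(seen):
            candidates += [mat_mul(a, b), mat_mul(b, a)]
        if mat_mul(a, a) == a:
            candidates.append(stabilize(a))
        for c in candidates:
            if c not in seen:
                seen.add(c)
                todo.append(c)
    return seen


def is_limited(generators, initial, final):
    """False iff some matrix in the closure has minimum omega over I x F."""
    for m in closure(generators):
        best = min((m[i][j] for i in initial for j in final), key=MIN_ORDER.index)
        if best == OMEGA:
            return False
    return True


# Example 1: state 0 plays A_1 (b is water), state 1 plays A_2 (a is water).
A = ((DRY, INF), (INF, WATER))
B = ((WATER, INF), (INF, DRY))
print(is_limited([A, B], {0, 1}, {0, 1}))   # False
print(is_limited([A], {0, 1}, {0, 1}))      # True: over {a}, state 1 always has water
```

For Example 1 the product of the stabilizations of `A` and `B` has $\omega$ on the whole diagonal, which witnesses that the automaton is not limited. The closure is computed naively, so this sketch is only practical for very small $n$.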
4.2 Stabilization Is a Consistent Mapping

We establish a first connection between stabilization and if'-limits of sequences. 

Proposition 2. Let $E \in \mathbb{D}_{n \times n}$ such that
$\Psi(E) \in E(\mathcal{D}_{n \times n})$. The sequence $(E^k)_{k \ge 1}$ is
convergent and $\Psi(E^k)_{k \ge 1} = \Psi(E)^\sharp$.

Proof (sketch). Let $e = \Psi(E)$. Let $i, j$ be arbitrary.

If $e^\sharp[i,j] = \infty$, then for every $k \ge 1$, $e^k[i,j] = \infty$ and
$E^k[i,j] = \emptyset$, and thus, $\Psi(E^k[i,j])_{k \ge 1} = \infty$. In the
rest of the proof, we assume $e^\sharp[i,j] \ne \infty$, i.e.,
$E^k[i,j] \ne \emptyset$ for every $k \ge 1$.

At first, we show that if $e^\sharp[i,j] = Y$ then
$\Delta(E^k[i,j], Y)_{k \ge 1}$ is bounded, and thus,
$\Psi(E^k[i,j])_{k \ge 1} = Y$. Since $e^\sharp[i,j] = Y$, there is some $l$
with $e[i,l] = e[l,l] = e[l,j] = Y$. There are $\pi_1 \in E[i,l]$,
$\pi_2 \in E[l,l]$, $\pi_3 \in E[l,j]$ such that
$\Psi(\pi_1) = \Psi(\pi_2) = \Psi(\pi_3) = Y$. For every $k \ge 2$, we have
$\pi_1(\pi_2)^{k-2}\pi_3 \in E^k[i,j]$ and
$\Delta(\pi_1(\pi_2)^{k-2}\pi_3) \le |\pi_1\pi_2\pi_3|$. Thus,
$\Delta(E^k[i,j], Y)_{k \ge 1}$ is bounded.

Finally, we deal with the case $e^\sharp[i,j] = \omega$. We assume by
contradiction that $\Delta(E^k[i,j], Y)_{k \ge 1}$ does not tend to infinity.
There is some $K \in \mathbb{N}$ such that we have $\Delta(E^k[i,j], Y) = K$
for infinitely many $k$. Choose some $k > (K+1)(n+1) + 1$ with
$\Delta(E^k[i,j], Y) = K$. Let $\pi \in E^k[i,j]$ with $\Delta(\pi) = K$ and
$\Psi(\pi) = Y$. By counting arguments, there are $0 < p < q < k$ and $l$ with
$(p + K) < q$ such that $\pi \in E^p[i,l]\, E^{q-p}[l,l]\, E^{k-q}[l,j]$,
i.e., we can factorize $\pi$ into $\pi_1, \pi_2, \pi_3$ which belong to
$E^p[i,l]$, $E^{q-p}[l,l]$, $E^{k-q}[l,j]$, respectively. We have
$|\pi_2| > K$, and since $\Delta(\pi) = K$, $\pi_2$ contains a water
transition. Thus, $Y = e^{q-p}[l,l] = e[l,l]$. Similarly, we obtain
$e[i,l], e[l,j] \in \{Y, /X\backslash\}$, and by Remark 1, we have
$e[i,l] = e[l,l] = e[l,j] = Y$, i.e., $e^\sharp[i,j] = Y$, which is a
contradiction.

For every $k \ge 1$, we have $\Delta(E^k[i,j], /X\backslash) \ge k$, because
every word in $E^k$ is at least of length $k$. Thus,
$\Delta(E^k[i,j], /X\backslash)_{k \ge 1}$ tends to infinity. To sum up, if
$e^\sharp[i,j] = \omega$, then $\Psi(E^k[i,j])_{k \ge 1} = \omega$.

Lemma 3. Let T C D„xn finite. For every idempotent e G T{T), we have 
e# G F{T). 

Proof. There is a sequence {wk)k>i G T+ with e = 'F{a{wk))k>i- By subse- 
quence selection, it suffices to consider the cases that {wk)k>i is strictly length 
increasing and that w\ = W 2 = ■ . . If {wk)k>i is strictly length increasing, then 
H\ does not occur in e**, and by Lemma 2, e** = e G 'F{T). If wi = W 2 = . ■ ■ , then 
let E = a{wi) G (T). By Prop. 2, we have e** = F{E^)k>i G F{T). 

Lemma 4. Stabilization is a consistent mapping. 

Proof Let e G E(2?„xn)- Let E G D„xn with F{E) = e. By Prop. 2 and 
Lemma 1, e»e» = W{E>^)k>iT{E^)k>i = F{E>^E%>i = F{E%>i = e«, i.e., 
G E(T>nxn)- Let a, 6 G Pnxn with ab, ba G E(2?„xn)- Let A,Bg D„xn with a = 
<F(A), b = tIb). Then, (ab)^ = F{{AB)^)k>i = a'F{{BA)^)k>i b = a{ba)^b. 

Lemma 5. Let $a \in Reg(\mathcal{D}_{n \times n})$ and $i, j$ be arbitrary. We
have $a^\sharp[i,j] \ne /X\backslash$.

If $a[i,j] \in \{\omega, \infty\}$, then $a^\sharp[i,j] = a[i,j]$. If
$a[i,j] = Y$, then $a^\sharp[i,j] \in \{Y, \omega\}$.

If $a[i,j] = /X\backslash$, then $a^\sharp[i,j] = \omega$.

Proof (sketch). Let $e \in E(\mathcal{D}_{n \times n})$ with
$e =_{\mathcal{L}} a$, i.e., $a = ae$, $a^\sharp = ae^\sharp$. The proof follows
by an examination of the product $ae^\sharp$ and Remark 1.
4.3 On the Growth of Entries

We call w = G ^ smooth product if 'P {Ax) , . . . {A\^\) is a 

smooth product. We extend the distance function A. 

1. For A € D„x„, let A(A) = ^(^[hj])- 

2. For m . . . ^ 1^1 G D+x„, let A{w) = A{Ai). 

3. For T = {■u;i,...,W|T|} C D+x„, let Z\(T) = maXig{i_,„jT|} 

Proposition 3. Let w G D^xn ® smooth product and i,j he arbitrary. 

1. If P{a{w))^[i,j] = Y, then Z\(a('u;)[i, jO < 2-A{w). 

2. If P{a{w))^[i,j] = u, then A[a{w)[i,j]) > - 1. 

Proof (sketch). Let w = A\ . . . A|^„| and ai = P{Ai) for every ^ G {1, . . . , |tu|}- 

(1) We have Y = (m . . .a\^\)^[i,j] = {a{ . . . af^|)[i, j]. There are i = io, . . . , 

i\w\ = j such that for every ? G {1, . . . , |w|} a\[ii-x,ii] G {Y, /X\}. By Lemma 5, 
we can conclude for every ^ G {1, . . . , |w|} = Y. Choose 

7T; G such that A{tti) is minimal. Since = Y and because 

the words in A[[ii-i,ii] are of the same length, we have P{tti) = Y, and of course 
Z\(7T;) < A(w). Thus, Z\(7ri . . .7T|,i,|) < 2 • A(w) which proves (1). 

(2) We assume |w| > 4”^n. Let tt G a{w)[i,j]. By a counting argument, 

there are 1 < fc < Z < |w| and p such that {I — k) > — 1, a^+i ■ . .ai G 

E(T’nxn), and we can factorize tt into tt = 7ri7r27T3 for some tti G {Ax . . . Ak)[i,p\, 
7T2 G {Ak+x ■ ..Ai)[p,p], and tts G {Ai+x ■ ..A\^()[p,j]. If !f(7r2) = Y, then we have 
P{a{w))^i, j] ^ oj. Thus, 7T2 G /X\+, i.e., A{-k) > Z\(7T2) = |7T2| = l — k > — 1. 

4.4 The Proof of Theorem 2 

In order to complete the proof of Theorem 2 by showing P{T) C {P{T))^, we 
define a relation to compare word matrices. Let iF > 1. Let X,Y G D. We denote 
X T if we have the following assertions: 

1. IfX = 0, thenT = 0. 

2. If X 0, then X AY 

3. X and Y “agree in their bounded words”, i.e., { tt G X | A{tt) < X } AY. 

In particular, for X G D with X 0 and A{X) > X, we have X Ak y, but we 
do not have X Ak 0- We generalize Ak componentwise to matrices in D„xn- 
It is easy to prove that Ak is stable w.r.t. matrix multiplication. 

We extend stabilization to word matrices. For matrices A G Dnxn, we define 
the stabilization A^ if 'P{A) G Reg(2?„xn) as follows: 

4#r- 

V if nA)n^,d] = Y. 

This definition is correct by Lemma 5. Note that if A^ is defined, then A^ G D„xn 
and we have P{A^) = P{A)^ and P{A^)[i,j] y^ /X\ for every i,j. 
Proposition 4. Let $K \ge 2$. Let $T$ be some finite subset of $\mathbb{D}_{n \times n}$.

There is some xk > 1 such that for every w G T+ there is a B G Dnxn satisfying 

'T{B) G {T{T))^, a{w) <k B, and 2^{B) < xk- 

We should spend some attention to the conditions A{B) < xk and a(tu) <k B. 
Let i,j be arbitrary. If a{w)[i,j] G {w, 0}, then a{w)[i,j] = B[i,j], If we have 
A{a{w)[i, j]) < K, then a{w)[i,j] A B[i,j] but a{w)[i,j] and B[i,j] agree in 
their bounded words (cf. 2, 3 in the definition of A k ). If A{a{w)[i,j]) > XK, 
then a{w) Ak B and A{B) < xk together imply B[i,j] = ui. 

We establish the following lemma to prove Prop. 4. 

Lemma 6. Let K > 2 and x > 1 be arbitrary. Let I' A I C be two 

ideals o/('f'(T))® such that L\L' is a ,1-class of {T{T))K 

There is some x' > 1 such that for every w = Ai . . . G D^xn satisfying 

Al. G ('f'(T))#, 

A2. A{w) < X, 

A3. For every / G {1, . . . , |w| — 1}, T{AiAi+i) G I , 

there is some v = B\ . . . B|„| G D^xn satisfying a(w) Ak a(v) and 

Cl. G MT))#, 

C2. A{v) < x' , 

C3. For every ^ G {1, . . . , |t;| - 1}, G T. 

In particular, this assertion is true for x' = 2-4" n{K + 2)x. 

At first, note the similarity between the assumptions (Al), (A2), (A3) and the 
claims (Cl), (C2), (C3). This similarity enables us to apply Lemma 6 inductively 
on a chain of ideals 0 C . . . C /2 C C (<F{T))'^ to prove Prop. 4. In the last 
step of this induction, /' is empty, and thus, claim (C3) implies that v has the 
length 1, and v is exactly the matrix B which we require to prove Prop. 4. 

Proof (Lemma 6). Let K, x, and w as in the lemma. 

We factorize w into words vi,V 2 , - ■ ■ , Vm- If 'T{Ai) G then let v\ = Ai and 
proceed with A 2 . . . A|„,|. If F(Ai) ^ then let v\ be the longest prefix of w 
satisfying F(a(yi)) ^ I' and proceed with the remaining part of w. 

In this way, we achieve some m > 1 and v\, ... ,Vm G D)(xn such that 

1. Al . . . A|„| = Vi . . .Vm (concatenation of words) 

2. F{a{vi)), . . . ,<F{a{vm)) G {T{T))^ 

3. For every I G {1, . . . , m — 1}, F{a{vivi+i)) G I' (by construction of vi) 

4. For every I G {1, . . . , m} with |?;;| > 1, we have 'F{a{vi)) G / \ /'. 

Let I G {1, . . . , m} be arbitrary. 

Case 1: $|v_l| \le 2 \cdot 4^{n^2} n (K + 2)$

We set Bi = a{vi). Then, a{vi) Ak Bi and Bi satisfies (Cl). Moreover, 
A{Bi) < |?;;| • A{vi) < |z;;| • Z\(w) = x', i.e., Bi satisfies (C2). 
Case 2: $|v_l| > 2 \cdot 4^{n^2} n (K + 2)$

We set Bi = a{vi)K However, we have to ensure 'I/{a{vi)) £ Reg((^(T)}^). 
We denote vi as vi = V\ .. . V|„,|. We transform vi into a word u. If is 
even, then we set u = a(ViV 2 )a(V 3 V 4 ) . . . a(V|„,|_iV|„,|). Otherwise, u = 
a(yiH 2 )a(V 3 K 4 ) . . .a(V|„,|_ 2 V|„,|_iV|^,|). Clearly, a{vi) = a{u). 

We have |m| > (4”^n+l)(KT+2). We denote the letters of u by u = . . . C/|„|. 

By (A3), we have for every k G |u|} 'P{Uk) G I. If 'P{Uk) G I', 

then 'F{a{u)) G /' and 'P{a{u)) = ^{a{vi)) G I' which contradicts (4), 
above. Hence, 'P{Uk) € I \ I' for every k G |m|}. Thus, / \ /' is a 

regular J-class of (^'(T))**, and hence, m is a smooth product. Moreover, 
'F{a{u)) = 'B{a{vi)) G Reg((tf'(T))t*), i.e., a(w/)* is defined, and = 

^{a{vi)^) =^{a{vi)f G (<f'(T))». 

We show a{u) <k a{u)'^. We know that a{u) and a{u)^ are only different in 
entries i,j for which 'I^{a{u)Y[i,j] = io. Let i, j with 'F{a{u)Y[i,j] = u. By 
Prop. 3(2) and |m| > (4”^n+l)(AT + 2), we get A{a{u)[i, j]) > ~ 1 > AT, 

and thus, o;(tt) :<k a{u)^, i.e., a{vi) = a(u) :<k a{uY = Oi{viY = Bi. 

Now, we take care on (C2) for Bi. Let i,j be arbitrary. Assume that Bi[i,j] 
contains some path. By Bi = a{u)^, B[[i,j] contains some water transi- 
tion, i.e., 'F{Bi)[i,j] = Y. By Prop. 3(1) and A{u) < 3 • A{w), we obtain 
A{Bi[i,j]) = A{a{u)[i, j]) < x' . Hence, Bi satisfies (C2). 

We show (C3). By (3), 'lf{a{vi))'I'{a{vij^i)) G I' for every I G {!,..., m—1}. By 
the definition of Bi, we have 'P(Bi) <l ^{a{vi)) and ^{Bi^i) <r <f'(a(w/+i)), 
i.e., ^{BiBi+i) = ^{Bi)<I'{Bi+i) <j i'{a{vi))<I'{a{vi+i)) G I'. 

We show a{w) :<k a{v). In both case 1 and 2, we have seen a{vi) <k a{Bi). 
By the stability of :<k w.r.t. matrix multiplication it follows a{w) :<k a{v). 

Proof (Prop. 4-)- Let 2 ; < 4"^ be the number of J-classes of {^{T))K 

Let w G T"*". We apply Lemma 6 z times over a chain of ideals {'P{T))^ = 
Li 2 ■ ■ ■ 2 2 ^z+i = 0- Initially, x = A{T) and (Al, A2, A3) are satisfied. 

In the last application of Lemma 6 we achieved a word v with |w| = 1 by (C3). 
We set B = V and xk = x' {x' from the last application of Lemma 6). 

Proof (Theorem 2). We show (tf'(T))t* C T{T). We have tf'(T) C 'T{T), because 
for every A gT, T{A) is the if'-limit of (A)fc>i. Moreover, T{T) is closed under 
multiplication (Lemma 1) and stabilization of idempotents (Prop. 3). 

We show T{T)C(T{T))K Let {wk)k>i G T~^ . We assume {a{wk))k>i £€{T) 
and denote a = T{a{wk))k>i- We have to show a G {T{T))K 

By subsequence selection, there is a AT > 1 such that for every i,j and I > 1: 

1. If a[i,j] = 00 , then a{wi)[i,j] = 0, i.e., T{a{wi))[i, j] = 00 . 

2. If a[i,j] G {Y,/X\}, then A{a{wi)[i, j]) < K and 'I'[a{wi))[i, j] = a[i,j]. 

Let Xk be from Prop. 4. There is a word w in {wk)k>i such that for every i,j 
with a[i,j] = CO, we have A{a{w)[i, j]) > xk- We apply Prop. 4 on ic and obtain 
B G D„xn- Let i,j be arbitrary. 

Assume a[i,j] = Y. By (2), there is some tt G a{w)[i,j] with T{tt) = Y and 
A{tt) < K. By a{w) Ak B, we have tt G B[i,j], i.e., T(B[i,j]) = Y = a[i,j]. 







Assume a\i,j] = N\. As above, there is a tt G B\i,j] with = N\ and 
A{ti) < K, i.e., G {Y,/X\}. If B[i,j] = Y, then we have (by a(w) :<k B) 

'I/{a{w))[i,j] = Y which contradicts (2). Hence, 'I/{B[i,j]) = H\ = a[i,j]. 

Assume a[i,j] = to. By (1), a{w)[i,j] yf 0. By the choice of w, we have 
A(7 t) > xk for every tt G a{w)[i,j]. Consequently, a{w) <k B and A{B) < xk 
imply B[i,j] = uj, i.e., <F{B[i,j]) = uj = a[i,j]. 

Finally, if a[i,j] = oo, then we have W{B[i,j]) = oo = a[i,j] in the same way. 
To sum up, we have ^{ce{wk))k>i = o, = ^{B) G (>F(r))>*. 

Proof (Theorem 1.). We combine Prop. 1 and Theorem 2. 



5 On the Finite Substitution Problem

To simplify some technical details, we forbid the empty word. In [4], we show 
the same result for free monoids. Let Σ_1 and Σ_2 be two alphabets. A mapping
σ : Σ_1 → P_f(Σ_2^+) is called a finite substitution. Every finite substitution extends
to a homomorphism σ : P(Σ_1^+) → P(Σ_2^+).

Theorem 3. It is decidable whether for two given recognizable languages
K ⊆ Σ_1^+ and L ⊆ Σ_2^+, there exists a finite substitution σ such that σ(K) = L.

Proof. Let η : Σ_2^+ → S(L) be the syntactic morphism of L. We call every
homomorphism τ : P(Σ_1^+) → P(S(L)) with τ(K) = η(L) a type. There are just
finitely many types, and there is an algorithm which computes the list of all
types. A substitution σ is of type τ if η(σ(a)) ⊆ τ(a) for every a ∈ Σ_1. Every
finite substitution σ with σ(K) = L is of the type η ∘ σ. Hence, it suffices to
decide the existence of a finite substitution σ of a given type τ with σ(K) = L.

Let Ak = [Qk, Ek, Ik, Fk] be an automaton which recognizes K. For every 
t = {p,a,q) G E, we construct a desert automaton At = [Qt, Et,p,q' , Efi] with 
L{At) = Et C (Q(\g') x A 2 x (Qt\p), and = E* n (Q* x A 2 x g') . 

We define A = [Q, E, Ik, Fk, E^]. We replace in Ak every t = (p, a, q) G Ek 
by At and identify p and q with the initial and accepting state of At. The key 
argument is that there is a finite substitution σ with σ(K) = L if and only if
L = L{A) and A is limited [4]. □ 

6 Next Research Steps 

The next step is to develop an automata concept which includes desert and 
distance automata as two extremal cases and to solve the limitedness problem 
of the new automata concept which allows a new proof for the decidability of 
the star height one problem [2,5]. 

Finally, we would like to address two problems on desert automata. 

Is the limitedness problem for desert automata in PSPACE? Prop. 1 and 
Theorem 2 give an algorithm with time complexity 2^^*-” \ where n is the number 
of states. One of the most examined problems on distance automata is to find
a sharp upper bound on the range of the distance function of limited distance
automata. The bound shown in [3] and recently improved in [9] allows one to show
that the limitedness problem for distance automata is in PSPACE [9]. The hope
is to find such a bound for desert automata.

It is undecidable whether two given distance automata compute the same 
mapping [6], but this problem is open for desert automata. 



Acknowledgments. The author acknowledges the discussions with Jean-Eric 
Pin on the paper.

Abstract. The satisfiability and not-all-equal satisfiability problems for
boolean formulas in CNF with at most two occurrences of each variable 
are complete for deterministic logarithmic space. 



1 Introduction 

The satisfiability problem (SAT) for formulas of propositional logic in conjunc- 
tive normal form (CNF) is the canonical complete problem for the complexity 
class NP [1] of nondeterministic polynomial time. Similarly, SAT problems re- 
stricted to several subclasses of CNF formulas are complete for smaller complex- 
ity classes. 

For Horn formulas, i.e., CNF formulas where every clause contains at most 
one positive literal, satisfiability is complete for deterministic polynomial time 
P [2]. For formulas in 2-CNF, i.e., formulas where every clause contains at most 
two literals, satisfiability is complete for nondeterministic logarithmic space NL 
[3]. We exhibit the first known natural special cases of SAT that are complete 
for deterministic logarithmic space L. 

Let CNF(2) be the class of formulas F ∈ CNF such that every variable
occurs at most twice in F, and let SAT(2) be the problem SAT restricted to
instances in CNF(2). It is well-known that SAT(2) can be decided in linear
time (see e.g. the book by Kleine Büning and Lettmann [4]). We will show that
SAT(2) is complete for L.

The not-all-equal-satisfiability problem (NAE-SAT) is a variant of SAT that 
is studied in many contexts. Given a formula in CNF, the question is whether 
there is a satisfying assignment that also falsifies at least one literal in every 
clause. 

In general, NAE-SAT is NP-complete for those classes of CNF formulas for
which SAT is also NP-complete. NAE-SAT restricted to formulas in 2-CNF is
complete for symmetric logarithmic space SL [3,5]. Recently, Porschen et al.
[6] have shown that NAE-SAT(2), defined analogously to SAT(2), is solvable
in linear time and is in the parallel complexity class NC. Their proof actually
shows that it is computable in parallel logarithmic time by a nearly linear
number of processors, and thus is in AC^1. We will show here that NAE-SAT(2)
is in, and in fact complete for, L.

It should be noted that our logarithmic space algorithms, in contradistinction
to the algorithms mentioned above, only solve the decision problems SAT(2)
and NAE-SAT(2); they do not give a witnessing assignment in case of a positive
answer. However, after a draft of this paper was circulated, Stephen Cook
(personal communication) and Mark Braverman [7] have given algorithms to
construct satisfying assignments for satisfiable CNF(2)-formulas in logarithmic
space.

It is easily checked that all the reductions we construct can be written as
first-order reductions, given the usual encoding of the problem instances as logical
structures (see Immerman [8] for background on these notions). Therefore, all
our reductions are uniform AC^0 many-one reductions.



2 Satisfiability 

In this section we show the L-completeness of SAT(2). To this end, we reduce 
SAT(2) to a problem on a certain class of graphs: 

A tagged graph G = (V, E, T) is an undirected multigraph (V, E) with a
distinguished set T ⊆ V of vertices. We refer to the vertices in T as the tagged
vertices.

We call a connected component in G tagged, if it contains at least one tagged 
vertex, and untagged otherwise. 

From a formula F ∈ CNF(2), we construct a tagged graph G(F) as follows:

— G(F) has a vertex v_C for every clause C in F.

— If clauses C and D contain a pair of complementary literals x and ¬x, then
there is an edge e_x between v_C and v_D.

— If C contains a pure literal, i.e., a literal a such that the complementary
literal ā does not occur in F, then v_C is tagged.

Note that there can be parallel edges between clauses containing more than one 
pair of complementary literals. 

The assignment of a value to a variable x in F corresponds to giving the edge
e_x in G(F) a direction, from the clause containing the literal among x, ¬x that
gets the value 1 to the one that gets the value 0. Thus a clause C is satisfied by
an assignment if v_C has nonzero outdegree.

Since clauses that contain pure literals can always be satisfied, the following 
characterization of satisfiability is rather obvious: 

Proposition 1. A formula F ∈ CNF(2) is satisfiable iff the edges in G(F) can
be directed so that in the resulting directed graph, there is no untagged sink.

This characterization leads us to the following lemma:

Lemma 2. A formula F ∈ CNF(2) is satisfiable iff every connected component
in G(F) contains a tagged vertex or a cycle.

Proof. It suffices to show that the condition on the right-hand side is equivalent
to the condition from Proposition 1. Since it is obviously necessary, we only need
to show it is sufficient.

Let a connected component C of G(F) contain a tagged vertex v. Perform
a depth-first-search of C starting from v, and direct every edge in the resulting 
tree towards the root v. This way, every vertex in C other than v will have an 
outgoing edge, so the only sink is v, which is tagged. The back-edges can be 
directed arbitrarily. 

If a connected component C contains a cycle v_1, v_2, ..., v_k, then we direct
the edges around the cycle. To obtain the direction of the other edges, perform
a depth-first search starting from v_1, v_2, ..., v_k in order, but during the search
from v_i, do not visit the vertices v_j for j > i. In the resulting forest, direct as
above all edges in every tree towards the root v_i. This way, every vertex in C will
have an outgoing edge, and the remaining edges can be directed arbitrarily. □

In other words, F is unsatisfiable iff G(F) contains a connected component that
is an untagged tree.
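
The criterion of Lemma 2 can be tried out on small instances. The following Python sketch builds G(F) and tests whether some component is an untagged tree. It is only an illustration: it uses union-find and edge counting rather than a logarithmic-space procedure, and the clause encoding (lists of nonzero integers, negative for negated literals) as well as all function names are assumptions not taken from the paper.

```python
from itertools import product

def tagged_graph(clauses):
    """Build G(F): one vertex per clause, an edge per complementary pair of
    literals, and a tag on every clause that contains a pure literal."""
    occurrences = {}                      # literal -> indices of clauses containing it
    for index, clause in enumerate(clauses):
        for literal in clause:
            occurrences.setdefault(literal, []).append(index)
    edges, tagged = [], set()
    for literal, where in occurrences.items():
        if -literal in occurrences:
            if literal > 0:               # add each complementary pair only once
                edges.append((where[0], occurrences[-literal][0]))
        else:
            tagged.update(where)          # pure literal: tag its clause(s)
    return edges, tagged

def satisfiable_by_lemma2(clauses):
    """True iff every component of G(F) has a tagged vertex or a cycle."""
    edges, tagged = tagged_graph(clauses)
    parent = list(range(len(clauses)))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    vertices, edge_count, has_tag = {}, {}, {}
    for v in range(len(clauses)):
        root = find(v)
        vertices[root] = vertices.get(root, 0) + 1
        has_tag[root] = has_tag.get(root, False) or v in tagged
    for u, _ in edges:
        root = find(u)
        edge_count[root] = edge_count.get(root, 0) + 1
    # A connected component is a tree iff it has exactly |V| - 1 edges.
    return all(has_tag[r] or edge_count.get(r, 0) >= vertices[r] for r in vertices)

def satisfiable_brute_force(clauses):
    """Reference check: try every assignment."""
    variables = sorted({abs(l) for clause in clauses for l in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(l)] == (l > 0) for l in clause) for clause in clauses):
            return True
    return False
```

For instance, the formula with clauses (x) and (¬x) yields a single untagged edge, hence an untagged tree, while (x ∨ y) and (¬x ∨ ¬y) yield two parallel edges, i.e., a cycle.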

Theorem 3. SAT(2) is in L. 

Proof. It suffices to show that the condition in Lemma 2 can be verified in 
logarithmic space. We employ a technique that was used by Cook and McKenzie 
[9] to test in logarithmic space whether a graph is acyclic. 

For a tagged graph G = (V, E, T), let D(G) := {(v, e) ; e incident on v} be
the set of darts of G, i.e., the ends of edges in G. For a dart d = (v, e) ∈ D(G), we
denote v by v(d) and e by e(d). We consider permutations of the set D(G). The
disjoint-cycle representations of the following two permutations can be easily
constructed from G:

ρ_G is the product of the cycles ((v, e_1) ... (v, e_k)) for every vertex v,
where e_1, ..., e_k are all the edges incident on v.

σ_G is the product of the transpositions ((v, e) (u, e)) for every edge e,
where e is an edge between vertices u and v.

By a result of Cook and McKenzie [9], from the disjoint-cycle representations 
of two permutations, one can compute the representation of their product in 
logarithmic space. 

Hence we can obtain the disjoint-cycle representation of the product π_G =
ρ_G ∘ σ_G. We will show how, using this representation of π_G, we can decide whether
G contains a connected component that is an untagged tree.

We start a search from every dart d ∈ D(G). If the search is successful for
every d, then we accept, otherwise we reject.

The search procedure performs two nested walks of the graph along the orbits
of π_G. The outer walk is started at w_1 := d, then the inner walk is started at
w_2 := w_1. It repeatedly remembers e' := e(w_2), and then sets w_2 := π_G(w_2),
until either a tagged vertex is found, i.e., v(w_2) ∈ T, or the walk returns to v(w_1),
i.e., v(w_2) = v(w_1). In the first case, the search terminates successfully. In the
second case, the search is successful if the walk did not return to v(w_1) through
e(w_1), i.e., e' ≠ e(w_1).

If none of these cases occur, then the outer walk is continued by updating
w_1 := π_G(w_1). If w_1 = d, then the search terminates unsuccessfully, otherwise
the inner walk is started again.

Note that the algorithm only stores two darts and one edge, so it runs in
logarithmic space. The problem is therefore in L, since logarithmic space
functions are closed under composition. To verify the correctness of the algorithm,
we need to prove the following claim:

Claim. For every dart d ∈ D(G), the search from d terminates unsuccessfully if
and only if the connected component of G containing v(d) is an untagged tree.

The “if” direction is obvious. For the other direction, we use the following
observation: if for every d' in the orbit of d, the walk along π_G returns to v(d')
through the edge e(d'), then the component of v(d) is a tree, which is seen as
follows:

If a vertex v is reached through the edge e = e_1, then the walk will traverse
every other edge leaving v before returning on e. In fact, if

((v, e_1) (v, e_2) ... (v, e_k))

is the orbit of (v, e_1) in ρ_G, then the walk will traverse the edges e_2, ..., e_k in
that order before returning on e_1: if u_i is the other vertex incident with e_i, then
π_G((u_i, e_i)) = (v, e_{i+1}).

It follows inductively that the walk visits the entire component of v(d). It
also follows that the component contains no cycle, by the following argument of
Cook and McKenzie [9]:

Let v_1, ..., v_k, v_{k+1} = v_1 be a cycle, with edges e_i between v_i and v_{i+1}, and
with v_1 reached first through edge e_0. By the above observation, for every i, at
v_{i+1} the walk would traverse e_{i+1} before returning on e_i. Therefore, the walk
returns to v_1 = v_{k+1} through e_k ≠ e_1, in contradiction to the assumption.

Therefore, if the search from d is unsuccessful, the component of v{d) is a tree, 
which is untagged, since the walk would have encountered any tagged vertices 
present. □ 
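
The two nested walks can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: darts are pairs (vertex, edge index), the cyclic order of edges at a vertex is simply the order of the input list, self-loops are excluded, and isolated vertices (which have no darts) are treated separately. The names `make_pi`, `search` and `no_untagged_tree` are hypothetical.

```python
def make_pi(n, edges):
    """Return pi_G = rho_G o sigma_G on darts (v, e), plus the list of darts."""
    incident = [[] for _ in range(n)]        # cyclic order of edges at each vertex
    for e, (u, v) in enumerate(edges):
        assert u != v, "this sketch assumes no self-loops"
        incident[u].append(e)
        incident[v].append(e)

    def pi(dart):
        v, e = dart
        a, b = edges[e]
        u = b if v == a else a               # sigma_G: move to the other end of e
        k = incident[u].index(e)
        return (u, incident[u][(k + 1) % len(incident[u])])  # rho_G: next edge at u

    darts = [(v, e) for v in range(n) for e in incident[v]]
    return pi, darts

def search(d, pi, tagged):
    """The two nested walks; keeps only the darts w1, w2 and the edge e_prev."""
    w1 = d
    while True:
        w2 = w1
        while True:
            e_prev = w2[1]
            w2 = pi(w2)
            if w2[0] in tagged:
                return True                  # found a tagged vertex
            if w2[0] == w1[0]:
                if e_prev != w1[1]:
                    return True              # returned on a different edge: cycle
                break
        w1 = pi(w1)
        if w1 == d:
            return False                     # the whole orbit has been checked

def no_untagged_tree(n, edges, tagged):
    """True iff no connected component is an untagged tree."""
    pi, darts = make_pi(n, edges)
    touched = {v for v, _ in darts}
    # An isolated untagged vertex is an untagged tree but has no dart.
    if any(v not in touched and v not in tagged for v in range(n)):
        return False
    return all(search(d, pi, tagged) for d in darts)
```

On an untagged path the search from every dart fails, whereas on a triangle, or on a path with a tagged endpoint, every search succeeds.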

Let SAT(2)^- be the restriction of SAT(2) to instances that contain no pure
literals, and let TF (tree-freeness) denote the following problem:

TF: Given an undirected graph G, does every connected component in
G contain a cycle?

As a consequence of Lemma 2, we obtain the following equivalence:

Proposition 4. SAT(2)^- is equivalent to TF.

Proof. One direction is given by the construction above, which produces no 
tagged vertices when F contains no pure literals. 







For the other direction, we can reverse the reduction as follows: For an
undirected graph G = (V, E), we construct a formula F(G) as follows: we introduce
one variable x_e for every edge e ∈ E, and for each vertex v ∈ V, we construct a
clause C_v that contains one literal for each edge e incident to v. This literal is
x_e if e connects v to a higher-numbered vertex, and ¬x_e otherwise.

Obviously, F(G) is a formula in CNF(2) with no pure literals, and
G(F(G)) = G, so by Lemma 2, the construction is a reduction from TF to
SAT(2)^-. □

Proposition 5. TF is L-complete. 

Proof. TF is in L by Proposition 4 and Theorem 3. Its L-hardness remains to 
be shown. 

We reduce the following problem UFA, which is known to be complete for 
L [9], to TF: Given an undirected forest G consisting of exactly two trees, and 
vertices u and v in G, are u and v in different trees? 

The reduction adds two new vertices to G, and connects them both by edges 
to u and v, as shown below, giving G' . 




Now if u and v are in the same tree, then the other tree is still a tree in G'. If u
and v are in different trees, then G' has only one connected component, which
contains a cycle. Thus the construction reduces UFA to TF. □
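
The reduction can be illustrated with a short Python sketch. The graph encoding (vertices 0..n-1, edges as pairs) and the helper names are assumptions made for the example, and TF is decided here by simple edge counting rather than in logarithmic space.

```python
def ufa_to_tf(n, edges, u, v):
    """Add two new vertices, each joined by an edge to both u and v."""
    a, b = n, n + 1
    return n + 2, list(edges) + [(a, u), (a, v), (b, u), (b, v)]

def every_component_has_cycle(n, edges):
    """Decide TF by counting: a component is a tree iff it has |V| - 1 edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, t in edges:
        parent[find(s)] = find(t)
    vertices, edge_count = {}, {}
    for x in range(n):
        root = find(x)
        vertices[root] = vertices.get(root, 0) + 1
    for s, _ in edges:
        root = find(s)
        edge_count[root] = edge_count.get(root, 0) + 1
    return all(edge_count.get(r, 0) >= vertices[r] for r in vertices)

# A forest with the two trees {0, 1, 2} and {3, 4}.
forest = [(0, 1), (1, 2), (3, 4)]
same_tree = every_component_has_cycle(*ufa_to_tf(5, forest, 0, 2))
different_trees = every_component_has_cycle(*ufa_to_tf(5, forest, 0, 3))
```

With u = 0 and v = 2 in the same tree, the tree {3, 4} survives in G', so G' is not in TF; with u = 0 and v = 3, the graph G' is connected and contains a cycle.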

From Propositions 5 and 4 above, we get that SAT(2)^- is L-hard, therefore also
SAT(2) is L-hard. Together with Theorem 3, this proves the main result of this 
section: 

Theorem 6. SAT(2) is L-complete. 

3 Not-All-Equal-Satisfiability 

We are now going to show the L-completeness of NAE-SAT(2). We first consider 
the problem for the special case of monotone formulas, which turns out to be 
equivalent to another problem on tagged graphs. 

Let an isolated clause be a unit clause such that the variable in this clause 
does not occur in any other clause. In this section we assume w.l.o.g. that for- 
mulas do not contain isolated clauses. This is possible, since no formula with an 
isolated clause is in NAE-SAT, and on the other hand such formulas are easily 
recognized. 

Let mCNF(2) be the class of monotone formulas in CNF(2), i.e., formu- 
las that contain only positive literals, and let mNAE-SAT(2) be the restric- 
tion of NAE-SAT(2) to instances in mCNF(2). Whereas satisfiability is trivial, 
NAE-SAT is NP-complete even for monotone formulas.

For a formula F ∈ mCNF(2), we define the tagged graph G'(F) by

— G'(F) has a vertex v_C for every clause C in F.

— If clauses C and D contain the same literal x, then there is an edge e_x
between v_C and v_D.

— If C contains a literal that does not occur in another clause, then v_C is
tagged.

Let E2C (edge-2-colorability) denote the following problem: 

E2C: Given a tagged graph G = (V, E, T), can the edges in G be colored
by two colors such that every untagged vertex v ∈ V \ T has incident
edges of both colors?

The following characterization of mNAE-SAT(2) is rather obvious.

Proposition 7. A formula F ∈ mCNF(2) is in NAE-SAT iff G'(F) is in E2C.

Note that for a formula F with an isolated clause, the graph G'(F) contains
a tagged isolated vertex. If an isolated clause is added to a formula
F ∈ NAE-SAT(2), then the resulting formula F' is no longer not-all-equal
satisfiable, whereas G'(F') is in E2C. Thus our assumption is needed for the
equivalence to hold.

In fact, we can show that the two problems are equivalent. 

Proposition 8. mNAE-SAT(2) is equivalent to E2C. 

Proof. One direction is Proposition 7. For the other direction, given a tagged
graph G = (V, E, T), we define a formula F(G) ∈ mCNF(2) as follows: for
every edge e ∈ E, there is a variable x_e. For every vertex v we form a clause
C_v containing the variables x_e for the edges e incident on v. Finally, for every
tagged vertex v ∈ T, we add a variable x_v to the clause C_v. It is easily seen that
G'(F(G)) = G, and hence by Proposition 7, the construction reduces E2C to
mNAE-SAT(2). □



Lemma 9. An undirected graph G is in E2C iff the following two conditions 
hold: 

1. every untagged vertex has degree at least two, and 

2. there is no untagged connected component that is a simple odd length cycle. 

Proof. Both conditions are obviously necessary. To see that they are sufficient, 
we first show the following claim: 

Claim. If the conditions above hold, then every untagged component C contains
either an even length cycle, or two edge-disjoint odd cycles.

Start a walk from some vertex on C, that never leaves a vertex on the same edge
it came from, which is possible by condition 1. Since C is finite, we must find a
cycle Z that way. Either Z is of even length, or else by condition 2 there must be
a vertex v on Z of degree at least 3. Start another walk leaving v on an edge not
on Z. Again, this walk must end in a cycle Z'. Now either Z' is of even length,
or otherwise it either is edge-disjoint from Z, or it shares a common part with
Z. But in the latter case, the cycle following Z and Z', leaving out the common
part, is of even length.

The task to show that a graph satisfying the two conditions can be edge- 
colored, can now be split into three subtasks, to show how to color each type of 
connected component. 

Claim. Every tagged component can be edge-colored. 

This is shown by induction on the number of vertices in the component. The 
induction basis is trivial. 

For the induction step, let a tagged component C be given, and let v be a
tagged vertex in C. We modify C by deleting v and all incident edges, and by
tagging all neighbors of v. The result C' is a union of several smaller tagged
components, which can be colored by the induction hypothesis. This coloring
can be extended to a coloring of C: if for a neighbor u of v, all incident edges
in C' receive the same color, then we give the edge between u and v the other
color. By induction, any tagged component can be colored.

Claim. A component C that contains an even length cycle can be edge-colored. 

We color the edges around the cycle by alternating colors. For a vertex on the 
cycle, the incident edges other than the two cycle edges can now be colored 
arbitrarily. We therefore modify C by deleting the edges in the cycle, and by 
tagging the vertices on the cycle. The result is a union of tagged components, 
which can be colored by the previous case. Thus we can color all of C . 

Claim. A component C that contains two edge-disjoint odd length cycles Z_1 and
Z_2 can be edge-colored.

Choose vertices v_1 on Z_1 and v_2 on Z_2 that are connected by a simple path P
(possibly of length 0). As in the previous claim, it suffices to color the edges on
Z_1, Z_2 and P. We color the two edges on Z_1 incident with v_1 by the same color
x, and the two edges on Z_2 incident with v_2 by x', where x' = x if P is of odd
length, and x' ≠ x otherwise.

The coloring can now be completed by coloring P and the rest of Z_1 and Z_2 by
alternating colors. □

From this characterization we see that E2C ∈ L, by the following algorithm:
First check that condition 1 holds, which is easy. Then for every dart d = (v, e) ∈
D(G), start a walk leaving v via e as in the above proof, until either a tagged
vertex or a vertex of degree at least 3 is found, in which case the walk terminates 
successfully. If neither happens before the walk returns to v, then v lies on a 
simple cycle, thus we count the number of steps in the walk to decide whether 
the cycle is of even or odd length, and terminate with success or not accordingly. 
By Proposition 8, we obtain the following result: 

Proposition 10. mNAE-SAT(2) is in L. 
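
As a sanity check of the characterization in Lemma 9, the following Python sketch tests the two conditions directly and compares the outcome with an exhaustive search over all edge colorings. It is an illustration under assumptions: graphs are given as vertex count, edge list and set of tagged vertices, self-loops are excluded, and the component scan is an ordinary graph search, not the logarithmic-space walk described above.

```python
from itertools import product

def in_e2c_by_lemma9(n, edges, tagged):
    """Check conditions 1 and 2 of Lemma 9."""
    degree = [0] * n
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        neighbours[u].append(v)
        neighbours[v].append(u)
    # Condition 1: every untagged vertex has degree at least two.
    if any(degree[v] < 2 for v in range(n) if v not in tagged):
        return False
    # Condition 2: no untagged component is a simple cycle of odd length.
    seen = set()
    for start in range(n):
        if start in seen:
            continue
        component, stack = [], [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            component.append(v)
            for w in neighbours[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        untagged = not any(v in tagged for v in component)
        simple_cycle = all(degree[v] == 2 for v in component)
        if untagged and simple_cycle and len(component) % 2 == 1:
            return False
    return True

def in_e2c_brute_force(n, edges, tagged):
    """Reference check: try all 2-colorings of the edges."""
    for colors in product([0, 1], repeat=len(edges)):
        seen = [set() for _ in range(n)]
        for (u, v), c in zip(edges, colors):
            seen[u].add(c)
            seen[v].add(c)
        if all(len(seen[v]) == 2 for v in range(n) if v not in tagged):
            return True
    return False
```

For example, an untagged triangle is rejected by both functions, while a triangle with one tagged vertex and two triangles sharing a vertex are accepted by both.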

We now show that the general case is in L as well: 

Theorem 11. NAE-SAT(2) is in L. 

Proof. We reduce NAE-SAT(2) to E2C. The definition of G'(F) is extended to
non-monotone formulas in CNF(2) by adding the clause:

— if C and D contain complementary literals x and ¬x, then we add a new
vertex v_x and connect it to v_C and v_D as shown below.

[Figure: the path v_C - v_x - v_D through the new vertex v_x]

The presence of the vertex v_x enforces that the two edges get different colors,
therefore F ∈ CNF(2) is in NAE-SAT iff G'(F) is in E2C. □

Proposition 12. E2C is L-complete. 

Proof. We reduce the following problem DCA, which is L-complete by a result
of Cook and McKenzie [9], to E2C: given a permutation π, and two points a and
b, do a and b lie on the same orbit of π?

The reduction produces a graph G(π) as follows: there are two vertices c and
c' for each point c, plus two extra vertices a'' and b''. In the graph G(π), every
c other than a, b is connected to π(c) by a path of length 2 going through c',
as shown below. Similarly, a is connected to π(a) by a path of length 3 going
through a' and a'', as shown below, and analogously for b.

[Figure: the path c - c' - π(c) of length 2, and the path a - a' - a'' - π(a) of length 3]

Note that G(π) consists of disjoint cycles corresponding to the orbits of π. Now
if a and b lie on the same orbit, then G(π) has only even length cycles, thus
is in E2C. Otherwise G(π) has two odd cycles, thus is not in E2C. Thus the
construction reduces DCA to E2C, and hence E2C is L-hard. We have shown
E2C ∈ L above, therefore E2C is L-complete. □
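
The parity argument behind this reduction can be checked on a small permutation. The sketch below is illustrative only: the permutation is a Python list with `perm[c]` the image of c, the vertex numbering (c' as n + c, the two extra vertices as 2n and 2n + 1) is an arbitrary choice, and a ≠ b is assumed.

```python
def dca_to_e2c(perm, a, b):
    """Build the untagged graph G(pi) as a vertex count and an edge list."""
    assert a != b, "the sketch assumes two distinct points"
    n = len(perm)
    a2, b2 = 2 * n, 2 * n + 1            # the extra vertices a'' and b''
    edges = []
    for c in range(n):
        c1 = n + c                       # the vertex c'
        edges.append((c, c1))
        if c == a:
            edges += [(c1, a2), (a2, perm[c])]   # path of length 3
        elif c == b:
            edges += [(c1, b2), (b2, perm[c])]   # path of length 3
        else:
            edges.append((c1, perm[c]))          # path of length 2
    return 2 * n + 2, edges

def cycle_lengths(n, edges):
    """Edges per connected component; every component of G(pi) is a cycle."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    lengths = {}
    for u, _ in edges:
        lengths[find(u)] = lengths.get(find(u), 0) + 1
    return sorted(lengths.values())

perm = [1, 2, 0, 4, 3]                   # orbits {0, 1, 2} and {3, 4}
same_orbit = cycle_lengths(*dca_to_e2c(perm, 0, 2))
different_orbits = cycle_lengths(*dca_to_e2c(perm, 0, 3))
```

When a and b share an orbit, that orbit's cycle gains two extra edges and all cycles stay even; otherwise exactly two cycles gain one extra edge each and become odd.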

From Propositions 12 and 8 above, we get that mNAE-SAT(2) is L-hard, 
therefore also NAE-SAT(2) is L-hard. Together with Theorem 11, this proves 
the main result of this section: 

Theorem 13. NAE-SAT(2) is L-complete.

Abstract. How difficult is it to find a path between two vertices in
finite directed graphs whose independence number is bounded by some
constant k? The independence number of a graph is the largest number
of vertices that can be picked such that there is no edge between any two
of them. The complexity of this problem depends on the exact question
we ask: Do we only wish to tell whether a path exists? Do we also wish to
construct such a path? Are we required to construct the shortest path?
Concerning the first question, it is known that the reachability problem is
first-order definable for all k. In contrast, the corresponding reachability
problems for many other types of finite graphs, including dags and trees,
are not first-order definable. Concerning the second question, in this
paper it is shown that not only can we construct paths in logarithmic
space, but there even exists a logspace approximation scheme for this
problem. It gets an additional input r > 1 and outputs a path that is at
most r times as long as the shortest path. In contrast, for directed graphs,
undirected graphs, and dags we cannot construct paths in logarithmic
space (let alone approximate the shortest one), unless complexity class
collapses occur. Concerning the third question, it is shown that even
telling whether the shortest path has a certain length is NL-complete
and thus as difficult as for arbitrary directed graphs.



1 Introduction 

Finding paths in graphs is one of the most fundamental problems in graph 
theory. The problem has both practical and theoretical applications in many 
different areas. For such problems we are given a graph G and two vertices s 
and t, the source and the target, and we are asked to find a path from s to t. 
This problem comes in different versions: The most basic one is the reachability
problem, which just asks whether such a path exists. This problem is also known
as ‘accessibility problem’ or ‘s-t-connectivity problem’. The construction problem
asks us to construct a path, provided one exists. The optimization problem asks
us to construct not just any path, but the shortest one. Closely related to the 
optimization problem is the distance problem, which asks us to decide whether 
the distance of s and t is bounded by a given number. If the optimization problem 
is difficult to solve, we can consider the approximation problem, which asks us 
to construct a path that is not necessarily a shortest path, but that is only a 
constant factor longer than the distance of s and t. 

In this paper it is shown that for directed graphs whose independence num- 
ber is bounded by some constant k the reachability problem, the construction 
problem, and the optimization problem have fundamentally different compu- 
tational complexities. The paper extends a previous paper [18] that treated 
only the reachability problem. The main contribution of the present paper is 
a logspace approximation scheme for the optimization problem and a proof that 
the distance problem is NL-complete. This paper presents the first example of
an optimization problem that cannot be solved optimally in logarithmic space
(unless L = NL), but that can be approximated well in logarithmic space.
Approximation theory has traditionally focused on polynomial-time computations;
mostly because approximation algorithms are typically only sought if computing
optimal solutions turns out to be NP-hard, but also because computing any
solution and computing an optimal solution seemed to have the same complexity
for the problems considered in small space complexity theory.

The independence number α(G) of a graph G is the maximum number of
vertices that can be picked from G such that there is no edge between any 
two of these vertices. The most prominent examples of graphs with bounded 
independence number are tournaments [17,20], which are directed graphs with 
exactly one edge between any two vertices. Their independence number is 1. 
The reachability problem for tournaments arises naturally if we try to rank or 
sort objects according to a comparison relation that tells us for any two objects 
which ‘beats’ the other, but that is not necessarily acyclic. 
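
To make the definition concrete, the following Python sketch computes the independence number of small graphs by exhaustive search and confirms that a tournament has independence number 1. The function name and the two example graphs are assumptions chosen for illustration; the search is exponential and only meant for tiny inputs.

```python
from itertools import combinations

def independence_number(n, edges):
    """Largest number of pairwise non-adjacent vertices (edge directions ignored)."""
    adjacent = {frozenset(e) for e in edges}
    best = 0
    for size in range(1, n + 1):
        found = any(
            all(frozenset(pair) not in adjacent for pair in combinations(subset, 2))
            for subset in combinations(range(n), size)
        )
        if not found:
            break          # independent sets are closed under taking subsets
        best = size
    return best

# A tournament on 5 vertices: exactly one directed edge between any two vertices.
tournament = [(i, j) if (i + j) % 2 else (j, i) for i, j in combinations(range(5), 2)]
# A directed 5-cycle, which is not a tournament.
cycle = [(i, (i + 1) % 5) for i in range(5)]
```

Here `independence_number(5, tournament)` is 1, while the directed 5-cycle has independence number 2.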

A different example of graphs with bounded independence number, studied
in [5], are directed graphs G = (V, E) whose underlying undirected graph is
claw-free, i.e., does not contain the K_{1,m} for some constant m, and whose minimum
degree is at least |V|/3. Their independence number is at most 3m − 3.

To get an intuition on the behaviour of the independence number function,
first note that independence is a monotone graph property: adding edges to a
graph can only decrease, deleting only increase the independence number. Given
two graphs with the same vertex set and independence numbers α and α', the
independence number of their union is at most the minimum of α and α' and the
independence number of their disjoint union is α + α'. Thus if a graph consists
of, say, four disjoint tournaments with arbitrary additional edges connecting
these tournaments, its independence number would be at most 4. Intuitively, a
graph with a low independence number must have numerous edges and, indeed,
at least $\binom{n}{2} / \binom{\alpha(G)+1}{2}$ edges must be present in any
n-vertex graph G. This
abundance of edges might suggest that if paths between two given vertices exist,
there should also exist a short path between them. While this is true for the
undirected case, in the directed case (which we are interested in in this paper)
the distance between two vertices can become as large as n − 1 even in n-vertex
tournaments.
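
One tournament that attains the distance n − 1 can be sketched as follows. The construction (edges i → i+1, and j → i whenever j ≥ i + 2) is an illustrative choice and is not taken from the paper; the function names are likewise hypothetical.

```python
from collections import deque

def long_distance_tournament(n):
    """Edges i -> i+1, and j -> i whenever j >= i + 2."""
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            edges.append((i, j) if j == i + 1 else (j, i))
    return edges

def distance(n, edges, s, t):
    """Length of a shortest directed path from s to t, or None if none exists."""
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in successors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist.get(t)
```

In this tournament the only edge leading from a vertex i to a larger vertex goes to i + 1, so every path from 0 to n − 1 has to pass through all vertices in order.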



1.1 How Difficult Is It to Tell Whether a Path Exists? 



The reachability problem for finite directed graphs, which will be denoted REACH
in the following, is well-known to be NL-complete [11,12] and thus easy from a
computational point of view. The complexity of the reachability problem drops
if we restrict the type of graphs for which we try to solve it. The reachability
problem REACH_u for finite undirected graphs is SL-complete [14] and thus
presumably easier to solve. The even more restricted problem REACH_forest for
undirected forests and the problem REACH_{out≤1} for directed graphs in which
all vertices have out-degree at most 1 are L-complete [4]. Here and in the
following ‘completeness’ always refers to completeness with respect to the
restrictive AC^0-reductions, i.e., many-to-one reductions that can be computed by a
family of logspace-uniform constant-depth circuits with unbounded fan-in and
fan-out [2,3].

The complexity of the reachability problem for finite directed graphs whose
independence number is bounded by a constant k is much lower: somewhat
surprisingly, this problem is first-order definable for all k, as shown in [18]. Formally,
for each k the language REACH_{α≤k} := REACH ∩ {⟨G, s, t⟩ | α(G) ≤ k} is
first-order definable, where ⟨ ⟩ denotes a standard binary encoding. Languages whose
descriptive complexity is first-order are known to be very simple from a
computational point of view. They can be decided by a family of logspace-uniform
AC^0-circuits [15], in constant parallel time on concurrent-read, concurrent-write
parallel random access machines (CRCW-PRAMs) [15], and in logarithmic space.
Since it is known that L-hard sets cannot be first-order definable [1,6], REACH_{α≤k}
is unconditionally easier to solve than REACH, REACH_u, and REACH_forest.

When studying the complexity of a graph problem, one usually assumes (as 
done above) that the input graph is encoded as a binary string ‘in some standardized 
way’. Which particular way of encoding is chosen is of little or no concern 
for the computational complexity of the problem. This is no longer true if the 
input graphs are encoded succinctly, as is often the case, for instance, in hardware 
design. Succinctly represented graphs are given indirectly via a program 
or a circuit that decides the edge relation of the graph. Papadimitriou, Yannakakis, 
and Wagner [19,23,24] have shown that the problems SUCCINCT-$\mathrm{REACH}$, 
SUCCINCT-$\mathrm{REACH}_u$, SUCCINCT-$\mathrm{REACH}_{\mathrm{forest}}$, and SUCCINCT-$\mathrm{REACH}_{\mathrm{out}\le 1}$ are all 
PSPACE-complete. In contrast, SUCCINCT-$\mathrm{REACH}_{\alpha\le k}$ is $\Pi_2^P$-complete for 
all $k$, see [18] once more. 



1.2 How Difficult Is It to Construct a Path? 

The low complexity of the reachability problem seemingly settles the complexity 
of finding paths in graphs with bounded independence number. At first sight, 
the path construction problem appears to reduce to the reachability problem 
via a simple algorithm: Starting at the source vertex, for each successor of the 
current vertex check whether we can reach the target from it (for at least one 
successor this test will be true); make that successor the current vertex; and 
repeat until we have reached the target. Unfortunately, this algorithm is flawed 
since it can lead us around in endless cycles for graphs that are not acyclic. 
A correct algorithm does not move to an arbitrary such successor, but to a successor that 
is nearest to the target. This corrected algorithm produces not just some 
path, but a shortest one. However, the algorithm now needs to compute the 
distance between two vertices internally, which is conceptually a more difficult 
problem than deciding whether two vertices are connected. 
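
The difference between the flawed and the corrected algorithm can be sketched as follows. This is only an illustration: it uses an ordinary breadth-first search as a stand-in for the distance computation and makes no attempt to respect the logarithmic space bound discussed in this paper. The function names and the example graph are assumptions chosen for the illustration.

```python
from collections import deque


def distances_to(graph, target):
    """Breadth-first search on the reversed graph: distance of each vertex to target."""
    reverse = {v: [] for v in graph}
    for u, successors in graph.items():
        for v in successors:
            reverse[v].append(u)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for u in reverse[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def shortest_path(graph, source, target):
    """Walk from source to target, always moving to a successor nearest to the target."""
    dist = distances_to(graph, target)
    if source not in dist:
        return None  # corresponds to the answer 'no path exists'
    path = [source]
    while path[-1] != target:
        current = path[-1]
        candidates = [v for v in graph[current] if v in dist]
        path.append(min(candidates, key=lambda v: dist[v]))
    return path


# A cycle a -> b -> c -> a with an exit c -> t. At c, the flawed algorithm may
# pick the successor a (from which t is reachable) and then cycle forever.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "t"], "t": []}
path = shortest_path(graph, "a", "t")
```

Because the distance to the target strictly decreases in every step, the walk cannot revisit a vertex, which is exactly what the naive reachability-only variant fails to guarantee.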

Nevertheless, we shall see that a path between any two connected vertices 
can be constructed in logarithmic space in graphs with bounded independence 
number. There even exists a logspace approximation scheme for this problem. 
This means that for each $r > 1$ and each $k$ there exists a logspace-computable 
function that maps an input $\langle G,s,t\rangle$ with $\alpha(G) \le k$ to a path from $s$ to $t$ of 
length at most $r$ times the distance of $s$ and $t$. If no path exists, the function 
outputs ‘no path exists’. 



1.3 How Difficult Is It to Construct the Shortest Path? 



How difficult is it to construct the shortest path in a graph with bounded independence 
number? We show that, again surprisingly, even for tournaments this 
problem is as difficult as constructing the shortest path in an arbitrary graph. 
As pointed out above, the complexity of constructing the shortest path hinges 
on the complexity of the distance problem $\mathrm{DISTANCE}_{\mathrm{tourn}} := \{\langle G,s,t,d\rangle \mid G$ is 
a tournament in which there is a path from $s$ to $t$ of length at most $d\}$. This 
problem is shown to be NL-complete. Thus $\mathrm{DISTANCE}$ and $\mathrm{DISTANCE}_{\mathrm{tourn}}$ are 
$\le_m^{\mathrm{AC}^0}$-equivalent, but $\mathrm{REACH}$ and $\mathrm{REACH}_{\mathrm{tourn}}$ are not. The succinct version of 
$\mathrm{DISTANCE}_{\mathrm{tourn}}$ is shown to be PSPACE-complete. 



1.4 Organization of This Paper 

This paper is organized as follows. In Section 2 graph-theoretic terminology and 
known results on the reachability problem for graphs with bounded indepen- 
dence number are reviewed. In Section 3 a logspace approximation scheme for 
the shortest path problem for graphs with bounded independence number is 
presented. In Section 4 the distance problem for tournaments is shown to be 
NL-complete and its succinct version is shown to be PSPACE-complete. 



2 Review of Known Results 

In this section graph-theoretic terminology and known results on the reachability 
problem in graphs with bounded independence number are reviewed. 

A (directed) graph is a nonempty set $V$ of vertices together with a set 
$E \subseteq V \times V$ of directed edges. A graph is undirected if its edge relation is symmetric. A 
tournament is a graph with exactly one edge between any two different vertices 
and $(v,v) \notin E$ for all $v \in V$. A forest is an undirected, acyclic graph. A tree is 
a connected forest. 

A path of length $\ell$ in a graph $G = (V,E)$ is a sequence $(v_0, \dots, v_\ell)$ of distinct 
vertices with $(v_i, v_{i+1}) \in E$ for $i \in \{0, \dots, \ell - 1\}$. A vertex $t$ is reachable from a 
vertex $s$ if there is a path from $s$ to $t$. The distance $d(s,t)$ of two vertices is the 
length of the shortest path between them or $\infty$, if no path exists. For $i \in \mathbb{N}$, a 
vertex $u \in V$ is said to $i$-dominate a vertex $v \in V$ if there is a path from $u$ to $v$ of 
length at most $i$. A set $U \subseteq V$ is an $i$-dominating set for $G$ if every vertex $v \in V$ 
is $i$-dominated by some vertex $u \in U$. The $i$-domination number $\beta_i(G)$ is the 
minimal size of an $i$-dominating set for $G$. A set $U \subseteq V$ is an independent set if 
there is no edge in $E$ connecting vertices in $U$. The maximal size of independent 
sets in $G$ is its independence number $\alpha(G)$. 

Fact 2.1 ([18]). Let $G = (V,E)$ be a finite graph with at least two vertices, 
$n := |V|$, $\alpha := \alpha(G)$, and $c := (\alpha^2 + \alpha)/(\alpha^2 + \alpha - 1)$. Then 

1. $\beta_1(G) \le \lceil \log_c n \rceil$ and 

2. $\beta_2(G) \le \alpha$. 

For tournaments $G$, Fact 2.1 yields $\beta_1(G) \le \lceil \log_2 n \rceil$ and $\beta_2(G) = 1$. The 
first result was first proved by Megiddo and Vishkin in [16], where it was used 
to show that the dominating set problem for tournaments is not NP-complete, 
unless $\mathrm{NP} \subseteq \mathrm{DTIME}[n^{O(\log n)}]$. The second result is also known as the Lion 
King Lemma, which was first noticed by Landau [13] in the study of animal 
societies, where the dominance relations on prides of lions form tournaments. It 
has applications in the study of P-selective sets [9] and many other fields. 
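
The statement $\beta_2(G) = 1$ for tournaments can be checked experimentally. The sketch below relies on the classical observation that a vertex of maximum out-degree in a tournament reaches every other vertex by a path of length at most 2; the helper names and the random tournament generator are assumptions made for this illustration only.

```python
import random
from itertools import combinations


def random_tournament(n, seed=0):
    """Orient each pair of the vertices 0..n-1 in one direction at random."""
    rng = random.Random(seed)
    edges = set()
    for u, v in combinations(range(n), 2):
        edges.add((u, v) if rng.random() < 0.5 else (v, u))
    return edges


def two_dominates(n, edges, u):
    """Does u reach every vertex by a path of length at most 2?"""
    out = {v for v in range(n) if (u, v) in edges}
    reached = {u} | out
    for w in out:
        reached |= {v for v in range(n) if (w, v) in edges}
    return len(reached) == n


def find_king(n, edges):
    """Return a vertex of maximum out-degree."""
    out_degree = [sum((u, v) in edges for v in range(n)) for u in range(n)]
    return max(range(n), key=lambda u: out_degree[u])


tournament = random_tournament(7, seed=1)
king = find_king(7, tournament)
```

The set consisting of `king` alone is then a 2-dominating set of size 1 for the sampled tournament.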

The next fact states that the complexity of the reachability problem for 
graphs with bounded independence number is low: $\mathrm{REACH}_{\alpha\le k}$ is first-order 
definable for all $k$. First-order definability is a language property studied in 
descriptive complexity theory. It can be defined as follows for the special case of 
languages $A \subseteq \{(V,E,s,t) \mid (V,E)$ is a finite graph, $s,t \in V\}$: Let $\tau = (E^2,s,t)$ 
be the signature of graphs with two designated vertices. A first-order $\tau$-formula 
is a first-order formula that contains, other than quantifiers, variables, and 
connectives, only the binary relation symbol $E$ and the constant symbols $s$ and $t$. An 
example is the formula $\exists x [E(s,x) \wedge E(x,t)]$. A $\tau$-structure is a tuple $(V,E,s,t)$ 
consisting of a graph $(V,E)$ and two vertices $s,t \in V$. A $\tau$-structure is a model 
of a $\tau$-formula if the formula holds when we interpret the relation symbol $E$ as 
the edge relation $E$ and the constant symbols $s$ and $t$ as the vertices $s$ and $t$. 
For example, every $\tau$-structure $(V,E,s,t)$ in which there is a path from $s$ to $t$ 
in the graph $(V,E)$ of length exactly 2 is a model of the $\tau$-formula 
$\exists x [E(s,x) \wedge E(x,t)]$. The language $A$ is first-order definable if there exists a $\tau$-formula $\phi$ 
such that $(V,E,s,t) \in A$ iff $(V,E,s,t)$ is a model of $\phi$. 
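
As a small illustration of the model relation, the example formula $\exists x [E(s,x) \wedge E(x,t)]$ can be evaluated directly on a finite $\tau$-structure. The representation of $E$ as a Python set of pairs and the function name are assumptions of this sketch.

```python
def is_model_of_example(V, E, s, t):
    """True iff (V, E, s, t) is a model of: exists x [E(s, x) and E(x, t)]."""
    # The existential quantifier ranges over the finite universe V.
    return any((s, x) in E and (x, t) in E for x in V)


V = {0, 1, 2}
E = {(0, 1), (1, 2)}
has_two_step_connection = is_model_of_example(V, E, 0, 2)
```

Each quantifier becomes one loop over the universe, so a fixed formula is evaluated with a constant number of nested loops regardless of the size of the graph.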

Fact 2.2 ([18]). For each $k$, $\mathrm{REACH}_{\alpha\le k}$ is first-order definable. 

The complexity of the reachability problem for graphs with bounded independence 
number is also interesting in the succinct setting. Succinctly represented 
graphs are given implicitly via a description in some description language. Since 
succinct representations allow the encoding of large graphs into small codes, 
numerous graph properties are (provably) harder to check for succinctly represented 
graphs than for graphs coded in the usual way. Papadimitriou et al. 
[19,24] have shown that most interesting problems for succinctly represented 
graphs are PSPACE-complete or even NEXP-complete. The following formalization 
of succinct graph representations follows Galperin and Wigderson [7], but 
others are also possible [24,8]. 

Definition 2.1. A succinct representation of a graph $G = (\{0,1\}^n, E)$ is a $2n$-input 
circuit $C$ such that for all $u,v \in \{0,1\}^n$ we have $(u,v) \in E$ iff $C(uv) = 1$. 

The circuit tells us for any two vertices of the graph whether there is a directed 
edge between them or not. Note that C will have size at least 2n since it has 2n 
input gates. 
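
To make Definition 2.1 concrete, the following sketch uses an ordinary Python function as a stand-in for the circuit $C$. The particular edge relation (an edge from $u$ to $v$ iff $v = u + 1$ as binary numbers) and the helper `expand` are assumptions chosen for illustration; they do not come from the paper.

```python
from itertools import product


def successor_circuit(bits):
    """Stand-in for a 2n-input circuit C: C(uv) = 1 iff v = u + 1 as binary numbers."""
    n = len(bits) // 2
    u, v = bits[:n], bits[n:]
    return int(int(v, 2) == int(u, 2) + 1)


def expand(circuit, n):
    """Write out the graph on {0,1}^n that the circuit represents."""
    vertices = ["".join(p) for p in product("01", repeat=n)]
    edges = {(u, v) for u in vertices for v in vertices if circuit(u + v)}
    return vertices, edges


vertices, edges = expand(successor_circuit, 3)
```

The description of `successor_circuit` has size polynomial in $n$, while the expanded graph has $2^n$ vertices and contains a path of length $2^n - 1$, which illustrates why problems on succinctly represented graphs can become much harder.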

Definition 2.2. Let $A \subseteq \{\langle G,s,t\rangle \mid G = (V,E)$ is a finite graph, $s,t \in V\}$. 
Then SUCCINCT-$A$ is the set of all codes $\langle C,s,t\rangle$ such that $C$ is a succinct 
representation of a graph $G$ with $\langle G,s,t\rangle \in A$. 

Fact 2.3 ([18]). For each $k$, SUCCINCT-$\mathrm{REACH}_{\alpha\le k}$ is $\Pi_2^P$-complete. 

3 Complexity of the Approximation Problem 

In this section it is shown that for graphs with bounded independence number we 
can not only tell in logarithmic space whether a path exists between two vertices, 
but we can also construct such a path. While it seems difficult to construct the 
shortest path in logarithmic space (by the results of the next section this is 
impossible unless L = NL), it is possible to find a path that is approximately 
as long as the shortest path. Even better, there exists a logspace approximation 
scheme for constructing paths whose length is as close to the length of the 
shortest path as we would like: 

Theorem 3.1. For all $k$ there exists a deterministic Turing machine $M$ with 
read-only access to the input tape and write-only access to the output tape such 
that: 

1. On input $\langle G,s,t,m\rangle$ with $\langle G,s,t\rangle \in \mathrm{REACH}_{\alpha\le k}$ and $m \ge 1$, it outputs a path 
from $s$ to $t$ of length at most $(1 + 1/m)\, d(s,t)$. 

2. On input $\langle G,s,t,m\rangle$ with $\langle G,s,t\rangle \notin \mathrm{REACH}_{\alpha\le k}$ it outputs ‘no path exists’. 

3. It uses space $O(\log m \log n)$ on the work tapes, where $n$ is the number of 
vertices in $G$. 

For the proof of the theorem we need two lemmas. The second lemma is a 
‘constructive version’ of Savitch’s theorem [21]. 

Lemma 3.2. There exists a function in FL that maps every input $\langle G,s,t\rangle \in 
\mathrm{REACH}_{\mathrm{forest}}$ to the shortest path from $s$ to $t$ in $G$ and all other inputs to ‘no path 
exists’. 




332 



T. Tantau 



Proof. The problem $\mathrm{REACH}_{\mathrm{forest}}$ is L-complete as shown in [4]. In order to compute 
the shortest path from $s$ to $t$ we iterate the following procedure, starting 
at $s$: For each neighbour $v$ of the current vertex, we check whether $t$ is reachable 
from $v$ in the forest obtained by removing the edge connecting the current vertex 
and $v$. There is exactly one vertex for which this test succeeds. We output 
this vertex, make it the new current vertex, and repeat the procedure until we 
reach $t$. □ 
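
The procedure from this proof can be sketched as follows. The sketch represents the forest as an adjacency dictionary and uses a depth-first search as a stand-in for the logspace reachability test, so it illustrates the control flow only and not the space bound; all names are assumptions of this illustration.

```python
def reachable_without_edge(adj, start, goal, removed):
    """Is goal reachable from start once the undirected edge `removed` is deleted?"""
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        if x == goal:
            return True
        for y in adj[x]:
            if {x, y} == set(removed) or y in seen:
                continue
            seen.add(y)
            stack.append(y)
    return False


def forest_path(adj, s, t):
    """Output the unique path from s to t in an undirected forest, or None."""
    if not reachable_without_edge(adj, s, t, (None, None)):
        return None  # 'no path exists'
    path = [s]
    while path[-1] != t:
        cur = path[-1]
        # Exactly one neighbour still reaches t after the edge to it is cut.
        nxt = next(v for v in adj[cur]
                   if reachable_without_edge(adj, v, t, (cur, v)))
        path.append(nxt)
    return path


# A tree on the vertices 1..5 and an isolated vertex 6.
forest = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2, 5], 5: [4], 6: []}
result = forest_path(forest, 1, 5)
```

Removing the edge to the candidate neighbour is what prevents the walk from stepping back toward the vertex it just came from.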

Lemma 3.3. There exists a deterministic Turing machine $M$ with read-only 
access to the input tape and write-only access to the output tape such that: 

1. On input $\langle G,s,t\rangle \in \mathrm{REACH}$ it outputs a shortest path from $s$ to $t$ and uses 
space $O(\log d(s,t) \log n)$ on the work tapes, where $n$ is the number of vertices 
in $G$. 

2. On input $\langle G,s,t\rangle \notin \mathrm{REACH}$ it outputs ‘no path exists’. It uses space $O(\log^2 n)$ 
on the work tapes, where $n$ is the number of vertices in $G$. 

Proof. We augment Savitch’s algorithm [21] by a construction procedure that 
outputs paths. If there are several paths, the procedure ‘decides on one of them’ 
and does so ‘within the recursion’. 

Let $\mathit{reachable}(u,v,\ell)$ be Savitch’s procedure for testing whether there is a 
path from $u$ to $v$ of length at most $\ell$. For $\ell = 1$, it checks whether $(u,v) \in E$ 
or $u = v$. For larger $\ell$, it checks whether for some vertex $z$ both the calls 
$\mathit{reachable}(u,z,\lfloor \ell/2 \rfloor)$ and $\mathit{reachable}(z,v,\ell - \lfloor \ell/2 \rfloor)$ succeed. As noted by Savitch, 
we can compute $\mathit{reachable}(u,v,\ell)$ in space $O(\log \ell \log n)$ since we can reuse space. 

We next define a procedure $\mathit{construct}(u,v,\ell)$ that writes a path of length $\ell$ 
from $u$ to $v$ onto an output tape, provided $\mathit{reachable}(u,v,\ell)$ holds. In order 
to simplify the assemblage of outputs of different calls to $\mathit{construct}$, the last 
vertex of the path, i.e., the vertex $v$, will be omitted. For $\ell = 1$, $\mathit{construct}$ 
simply outputs $u$. For larger $\ell$, it finds the first vertex $z$ for which both the calls 
$\mathit{reachable}(u,z,\lfloor \ell/2 \rfloor)$ and $\mathit{reachable}(z,v,\ell - \lfloor \ell/2 \rfloor)$ succeed. For this vertex $z$ it 
first calls $\mathit{construct}(u,z,\lfloor \ell/2 \rfloor)$ and then $\mathit{construct}(z,v,\ell - \lfloor \ell/2 \rfloor)$. 

The machine $M$ iteratively calls $\mathit{reachable}(s,t,\ell)$ for increasing values of $\ell$. 
For the first value $\ell$ for which this test succeeds, it calls $\mathit{construct}(s,t,\ell)$, appends 
the missing vertex $t$, and quits. If the tests do not succeed for any $\ell < n$, it 
outputs ‘no path exists’. □ 
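
The two procedures of this proof translate almost literally into recursive code. The sketch below follows the description above; the trivial case $s = t$ is handled separately, which is a choice made for this illustration, and the recursion depth rather than the running time is what mirrors the space bound.

```python
def reachable(vertices, edges, u, v, length):
    """Savitch-style test: is there a path from u to v of length at most `length`?"""
    if length <= 1:
        return u == v or (u, v) in edges
    half = length // 2
    return any(reachable(vertices, edges, u, z, half)
               and reachable(vertices, edges, z, v, length - half)
               for z in vertices)


def construct(vertices, edges, u, v, length, out):
    """Append a path from u to v of the given length to `out`, omitting v itself."""
    if length == 1:
        out.append(u)
        return
    half = length // 2
    # Decide on the first suitable midpoint 'within the recursion'.
    z = next(z for z in vertices
             if reachable(vertices, edges, u, z, half)
             and reachable(vertices, edges, z, v, length - half))
    construct(vertices, edges, u, z, half, out)
    construct(vertices, edges, z, v, length - half, out)


def shortest_path(vertices, edges, s, t):
    if s == t:
        return [s]
    for length in range(1, len(vertices)):
        if reachable(vertices, edges, s, t, length):
            out = []
            construct(vertices, edges, s, t, length, out)
            return out + [t]  # append the missing last vertex
    return None  # 'no path exists'


vertices = [0, 1, 2, 3, 4]
edges = {(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)}
path = shortest_path(vertices, edges, 0, 3)
```

Since `construct` is only called with the smallest length for which `reachable` succeeds, both halves of every split are themselves shortest paths, so no vertex is ever written twice.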

Proof (of Theorem 3.1). Let an input $\langle G,s,t,m\rangle$ be given. Let $G = (V,E)$ and 
$n := |V|$. For a set $U$ of vertices let $d(U,t) := \min\{d(u,t) \mid u \in U\}$. 

We first check, in space $O(\log n)$, whether $\langle G,s,t\rangle \in \mathrm{REACH}_{\alpha\le k}$ holds and 
output ‘no path exists’ if this is not the case. Otherwise we enter a loop in 
which we construct a sequence $U_1, U_2, \dots, U_\ell \subseteq V$ of vertex sets with $U_1 = \{s\}$ 
and $U_\ell = \{t\}$. For the construction of $U_{i+1}$ we access only $U_i$ and use space 
$O(\log m \log n)$. Once we have constructed $U_{i+1}$ we erase $U_i$ and reuse the space 
it occupied. 

The set $U_i$ is obtained from $U_{i-1}$ as follows: If $d(U_{i-1},t) \le 2m + 1$, let 
$U_i := \{t\}$. Otherwise let $S_i := \{v \in V \mid d(U_{i-1},v) = 2m + 2\}$ and choose $U_i \subseteq S_i$ 
as a 2-dominating set of size at most $k$ of the graph $G' := (S_i, E \cap (S_i \times S_i))$ 
induced on the vertices in $S_i$. Since $\alpha(G') \le \alpha(G) \le k$, such a 2-dominating 
set $U_i$ exists by Fact 2.1. We can obtain it in space $O(\log m \log n)$ since the 
question ‘$v \in S_i$?’ can be answered in space $O(\log m \log n)$ using the procedure 
$\mathit{reachable}$ from Lemma 3.3. 

The sets $U_i$ have the following properties for $i \in \{2, \dots, \ell - 1\}$: 

1. All elements of $U_i$ are reachable from $s$. 

2. $|U_i| \le k$. 

3. $d(U_{i-1},u) = 2m + 2$ for all $u \in U_i$. 

4. $d(U_i,t) \le d(U_{i-1},t) - 2m$ and hence $d(U_i,t) \le d(s,t) - 2m(i - 1)$. 

To see that the last property holds, note that $d(U_i,t) \le d(S_i,t) + 2$ and that 
$d(S_i,t) = d(U_{i-1},t) - 2m - 2$. For $i = \ell$, the first two properties are also true 
and the third one becomes $d(U_{\ell-1},t) \le 2m + 1$. 

Intuitively, in each iteration we reduce the distance between $U_i$ and $t$ by at 
least $2m$ and each $U_{i-1}$ can be connected to the next $U_i$ by a path of length 
$2m + 2$. It remains to explain how to connect the $U_i$’s correctly. 

In order to output the desired path from $s$ to $t$ of length at most 
$(1 + 1/m)\, d(s,t)$, we first construct a forest that contains this path. The forest is not 
actually written down anywhere (we are allowed only a logarithmic amount of 
space). Rather, as in the proof of FL being closed under composition, the forest’s 
code is dynamically recalculated in space $O(\log m \log n)$ whenever one of its bits 
is needed. Finding the shortest path in a forest can be done in logarithmic space 
by Lemma 3.2, and the shortest path in the forest will be the desired path. 

To define the forest $F$, for each $i \in \{2, \dots, \ell\}$ we first define a ‘small’ forest 
$F_i$ as follows: For each $u \in U_i$ it contains the vertices and edges of the shortest 
path from $U_{i-1}$ to $u$. This path is constructed by calling the machine $M$ from 
Lemma 3.3 on input $\langle G,u',u\rangle$ for the first vertex $u' \in U_{i-1}$ for which $d(u',u)$ 
is minimal. Since $d(u',u) \le 2m + 2$, this call needs space $O(\log m \log n)$. The 
graph $F_i$ is, indeed, a forest since if two paths output by $M$ for the same source 
vertex split at some point, they split permanently. Let $F$ be the union of all the 
forests $F_i$ constructed during the run of the algorithm. This union is a forest 
since every tree in a forest $F_i$ has at most one vertex in common with any other 
tree in a forest $F_j$ with $j \ne i$. 

Consider the shortest path from $s$ to $t$ in the forest $F$. This path passes 
through all $U_i$. For $i \in \{1, \dots, \ell\}$ let $u_i \in U_i$ be the last vertex of $U_i$ on this path. 
The total length of the path is given by $\sum_{i=1}^{\ell-1} d(u_i, u_{i+1})$. We have $d(u_i, u_{i+1}) = 
2m + 2$ for $i \in \{1, \dots, \ell - 2\}$. Thus the total length is 

$$\begin{aligned}
(2m+2)(\ell-2) + d(u_{\ell-1},t) &= (2m+2)(\ell-2) + d(U_{\ell-1},t) \\
&\le (2m+2)(\ell-2) + d(s,t) - 2m(\ell-2) \\
&= d(s,t) + 2(\ell-2) \le d(s,t) + d(s,t)/m.
\end{aligned}$$

For the two inequalities, we both times used the last property of $U_{\ell-1}$, by which 
$d(U_{\ell-1},t) \le d(s,t) - 2m(\ell - 2)$ and hence also $2(\ell - 2) \le d(s,t)/m$. □ 

The space bound from Theorem 3.1 is optimal in the following sense: Suppose 
we could construct a machine $M'$ that uses space $O(\log^{1-\epsilon} m \log n)$ and achieves 
the same as $M$. Then $\mathrm{DISTANCE}_{\mathrm{tourn}} \in \mathrm{DSPACE}[\log^{2-\epsilon} n]$, because $M'$ outputs 
the shortest path for $m = n + 1$. The results of the next section show that this 
would imply $\mathrm{NL} \subseteq \mathrm{DSPACE}[\log^{2-\epsilon} n]$. 

4 Complexity of the Distance Problem 

In this section we study the complexity of the distance problem for graphs with 
bounded independence number. This problem asks us to decide whether the 
distance of two vertices in a graph is at most a given input number. It 
is shown that this problem is NL-complete even for tournaments and that the 
succinct version is PSPACE-complete. 

The distance problem is closely linked to the problem of constructing the 
shortest path in a graph: As argued in the introduction, we can construct the 
shortest path in a graph if we have oracle access to the distance problem for this 
graph. The other way round, we can easily solve the distance problem if we have 
oracle access to an algorithm that constructs shortest paths. Because of this 
close relationship, the completeness result dashes any hope of finding a logspace 
algorithm for constructing shortest paths in tournaments, unless L = NL. 

Theorem 4.1. The problem $\mathrm{DISTANCE}_{\mathrm{tourn}}$ is NL-complete. 

Proof. We show $\mathrm{REACH} \le_m^{\mathrm{AC}^0} \mathrm{DISTANCE}_{\mathrm{tourn}}$. Let an input $\langle G,s,t\rangle$ be given. 
Let $G = (V,E)$ and $n := |V|$. The tournament $G' = (V',E')$ is constructed as 
follows: The vertex set $V'$ is $\{1, \dots, n\} \times V$. We can think of this vertex set as 
a grid consisting of $n$ rows and $n$ columns. There is an edge in $G'$ from a vertex 
$(r_1,v_1)$ to a vertex $(r_2,v_2)$ iff one of the following conditions holds: 

1. $r_2 = r_1 + 1$ and $(v_1,v_2) \in E \cup \{(v,v) \mid v \in V\}$, i.e., if there is an edge from $v_1$ to $v_2$ 
in $G$ or if $v_1 = v_2$, then there is an edge leading ‘downward’ between them 
on adjacent rows. 

2. $r_1 = r_2$ and $v_1 < v_2$, where $<$ is some linear ordering on $V$, i.e., the vertices 
on the same row are ordered linearly. 

3. $r_2 = r_1 - 1$ and $(v_2,v_1) \notin E \cup \{(v,v) \mid v \in V\}$, i.e., if there is no edge from $v_2$ to $v_1$ 
in $G$ and if they are not identical, then there is an edge leading 
‘upward’ between them on adjacent rows. 

4. $r_2 \le r_1 - 2$, i.e., all edges spanning at least two rows point ‘upward’. 

The reduction machine poses the query ‘Is there a path from $s' = (1,s)$ to 
$t' = (n,t)$ in $G'$ of length at most $n - 1$?’ Clearly this query can be computed 
by a logspace-uniform family of $\mathrm{AC}^0$-circuits. 

To see that this reduction works, first assume that there exists a path from 
$s$ to $t$ in $G$ of length $m \le n - 1$. Let $(s, v_2, \dots, v_m, t)$ be this path. Then $((1,s), 
(2,v_2), \dots, (m,v_m), (m+1,t), \dots, (n,t))$ is a path in $G'$ of length $n - 1$. 
Second, assume that there exists a path from $s'$ to $t'$ in $G'$ of length $m \le n - 1$. 
Then $m = n - 1$ since any path from the first row to the last row must ‘brave 
all rows’: there are no edges that allow us to skip a row. Let $(v'_1, \dots, v'_n)$ be 
this path. Then $v'_i = (i,v_i)$ for some vertices $v_i \in V$. The sequence $(v_1, \dots, v_n)$ 
is ‘almost’ a path from $s$ to $t$ in $G$: For each $i \in \{1, \dots, n - 1\}$ we either have 
$v_i = v_{i+1}$ or $(v_i, v_{i+1}) \in E$. Thus, by removing consecutive duplicates and loops, 
we obtain a path from $s$ to $t$ in $G$. □ 
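
The construction of the grid tournament can be tried out on small inputs. The following sketch builds $G'$ explicitly and answers the distance query by breadth-first search. It reads the ‘upward’ rule between adjacent rows as the exact complement of the ‘downward’ rule, so that every pair of vertices receives exactly one edge; this reading, the helper names, and the example graph are assumptions of this illustration.

```python
from collections import deque
from itertools import combinations


def grid_tournament(vertices, edges):
    """Build G' on {1..n} x V from an arbitrary directed graph G = (V, E)."""
    n = len(vertices)
    order = {v: i for i, v in enumerate(vertices)}   # some linear ordering on V
    down = set(edges) | {(v, v) for v in vertices}   # E together with all pairs (v, v)
    nodes = [(r, v) for r in range(1, n + 1) for v in vertices]
    new_edges = set()
    for a in nodes:
        for b in nodes:
            (r1, v1), (r2, v2) = a, b
            if a == b:
                continue
            if r2 == r1 + 1 and (v1, v2) in down:          # rule 1: downward
                new_edges.add((a, b))
            elif r1 == r2 and order[v1] < order[v2]:       # rule 2: same row
                new_edges.add((a, b))
            elif r2 == r1 - 1 and (v2, v1) not in down:    # rule 3: upward
                new_edges.add((a, b))
            elif r2 <= r1 - 2:                             # rule 4: long upward edges
                new_edges.add((a, b))
    return nodes, new_edges


def is_tournament(nodes, edges):
    """Exactly one of the two possible edges exists between any two nodes."""
    return all(((a, b) in edges) != ((b, a) in edges)
               for a, b in combinations(nodes, 2))


def distance(nodes, edges, src, dst):
    """Breadth-first search distance from src to dst, or None if unreachable."""
    succ = {a: [] for a in nodes}
    for a, b in edges:
        succ[a].append(b)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        a = queue.popleft()
        for b in succ[a]:
            if b not in dist:
                dist[b] = dist[a] + 1
                queue.append(b)
    return dist.get(dst)


# G has the path a -> b -> c, so c is reachable from a but a is not reachable from c.
nodes, t_edges = grid_tournament(["a", "b", "c"], {("a", "b"), ("b", "c")})
forward = distance(nodes, t_edges, (1, "a"), (3, "c"))
backward = distance(nodes, t_edges, (1, "c"), (3, "a"))
```

In the example $n = 3$, so the query asks for a path of length at most 2 from the first to the last row; it succeeds for the pair $(a, c)$ and fails for the pair $(c, a)$.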

By the above theorem, $\mathrm{DISTANCE}$ and $\mathrm{DISTANCE}_{\mathrm{tourn}}$ are $\le_m^{\mathrm{AC}^0}$-equivalent, 
while $\mathrm{REACH}$ and $\mathrm{REACH}_{\mathrm{tourn}}$ are not. The ‘complexity jump’ from $\mathrm{REACH}_{\mathrm{tourn}}$ 
to $\mathrm{DISTANCE}_{\mathrm{tourn}}$ is reflected by a similar jump for the succinct versions. 

Definition 4.2. Let SUCCINCT-$\mathrm{DISTANCE}_{\mathrm{tourn}}$ denote the language that contains 
all coded tuples $\langle C,s,t,d\rangle$, where $C$ is a circuit, $s$ and $t$ are bitstrings, and $d$ is 
a positive integer, such that $C$ is a succinct representation of a graph $G$ with 
$\langle G,s,t,d\rangle \in \mathrm{DISTANCE}_{\mathrm{tourn}}$. 



Theorem 4.3. SUCCINCT-$\mathrm{DISTANCE}_{\mathrm{tourn}}$ is PSPACE-complete. 

Proof. Since $\mathrm{DISTANCE}_{\mathrm{tourn}} \in \mathrm{NL}$, we have 

SUCCINCT-$\mathrm{DISTANCE}_{\mathrm{tourn}} \in \mathrm{NPSPACE} = \mathrm{PSPACE}$. 

For the hardness, let $A \in \mathrm{PSPACE}$ be an arbitrary language and let $M$ be a 
polynomial-space machine that accepts $A$. We show that $A$ is $\le_m^{\mathrm{AC}^0}$-reducible 
to SUCCINCT-$\mathrm{DISTANCE}_{\mathrm{tourn}}$. For an input $x$, let $G$ denote the configuration 
graph of $M$ on input $x$, let $s$ be the initial configuration, let $t$ be the (unique) 
accepting configuration, and let $d$ be an (exponential) bound on the running 
time of $M$ on input $x$. Let $G'$ be the tournament constructed in Theorem 4.1 
and let $C$ be an appropriate circuit that represents $G'$. Then $x \in A$ iff 
$\langle C,s,t,d\rangle \in$ SUCCINCT-$\mathrm{DISTANCE}_{\mathrm{tourn}}$. 

The representing circuit $C$ can be constructed by a logspace-uniform family of 
$\mathrm{AC}^0$-circuits. To see this, first note that the circuit $C$ can easily be constructed in 
logarithmic space since $G'$ is highly structured. For an appropriate construction, 
$C$ will depend on $x$ only in a very limited way: For each bit of $x$ there is a 
constant gate in $C$ that ‘feeds’ this bit to the rest of the circuit, which does not 
depend on $x$ at all. Thus we can hardwire almost all of $C$ into the $\mathrm{AC}^0$-circuit 
that computes it; only $C$’s constant gates must be set up depending on $x$. □ 



5 Conclusion 

The results of this paper extend the answer to the question ‘How difficult is it to 
find paths in graphs with bounded independence number?’ in two different ways. 
It was previously known that checking whether a path exists in a given graph 
can be done using $\mathrm{AC}^0$-circuits. In this paper it was shown that constructing a 
path between two vertices can be done in logarithmic space. Constructing the 
shortest path in logarithmic space was shown to be impossible, unless L = NL. 







These results settle the approximability of the (logspace) optimization problem 
‘shortest paths in graphs with bounded independence number’. This minimization 
problem cannot be solved exactly in logarithmic space (unless L = NL), 
but it can be approximated well: there exists a logspace approximation scheme 
for it. As we saw, the space $O(\log m \log n)$ needed by the scheme for a desired 
approximation ratio of $1 + 1/m$ is essentially optimal: any approximation scheme 
that does substantially better could be used to show unlikely inclusions like 
$\mathrm{NL} \subseteq \mathrm{DSPACE}[\log^{2-\epsilon} n]$. Thus it seems appropriate to call the scheme a ‘fully 
logspace approximation scheme’ in analogy to ‘fully polynomial-time approximation 
schemes’. 

The shortest path problem for tournaments is not the only logspace opti- 
mization problem with surprising properties: In [22] it is shown that the distance 
problem for undirected graphs is also NL-complete, while the reachability prob- 
lem is SL-complete. On the other hand, the distance problem for directed graphs 
is just as hard as the reachability problem for directed graphs. This shows that, 
just as in the polynomial-time setting, logspace optimization problems can have 
different approximation properties, although their underlying decision problems 
have the same complexity. 



Abstract. We exhibit an NP-complete problem defined by an existential 
monadic second-order (EMSO) formula over functional structures 
that is: 

1. minimal under several syntactic criteria (i.e., any EMSO formula 
that further strengthens any criterion defines a PTIME problem even 
if all other criteria are weakened); 

2. unique for such restrictions, up to renamings and symmetries. 

Our reductions and proofs are surprisingly elementary and simple 
in comparison with some recent similar results classifying existential 
second-order formulas over relational structures according to their 
ability either to express NP-complete problems or to express only 
PTIME ones. 

Keywords: Computational complexity, descriptive complexity, finite 
model theory, second-order logic, NP-completeness, parsimony. 



1 Introduction and Main Results 

1.1 Which Formulas Express NP-Complete Problems? 

In line with Fagin’s Theorem [5], which states that existential second-order logic 
(ESO) captures the class NP, this paper studies the following natural question: 
what is (are) the simplest ESO sentence(s) that define(s) some NP-complete 
problem(s)? This question is somewhat related to two recent papers [7,4] that 
completely classified prefix classes of ESO over strings and graphs (and more generally 
over relational structures) with respect to their ability to express either 
some NP-complete problems or only tractable (i.e., PTIME) ones. For example, 
it is easy to express an NP-complete problem over graphs, such as 3-colourability, 
in existential monadic second-order logic (EMSO) with only two first-order variables. 
In contrast, one notices that ESO formulas that use only relation ESO variables 
and only one first-order variable can only define easy (degenerate) properties 
on relational structures. The situation completely changes if function symbols 
are allowed either in the input signature or among the ESO symbols. For example, 
ESO formulas with only one first-order variable $x$ of one of the forms (1-2) 

(1) $\exists \bar{f}\, \forall x\, \psi(x, \bar{f}, E)$ 

(2) $(D, \bar{f}) \models \exists \bar{U}\, \forall x\, \psi(x, \bar{f}, \bar{U})$ 

(where $x$ is quantified over the finite domain $D$, $E$ is a binary relation symbol, 
$\bar{f}$ and $\bar{U}$ are lists of unary function symbols and of monadic relation symbols 
respectively, and $\psi$ is quantifier-free) can express some NP-complete problems. 
More precisely, [9] has recently proved that formulas of form (1) exactly define 
graph problems (such as the Hamiltonian cycle problem) that are recognizable 
in nondeterministic linear time $O(n)$ where $n$ is the number of vertices in the 
graph, and [1] states that any problem is linearly reducible to Sat iff it is linearly 
reducible to some problem expressible by some formula of the form (2) (see also 
[12]). Moreover, as proved in [1], it can be assumed that the unary functions $\bar{f}$ 
of the input structures are permutations: the class of such problems is called 
LIN-LOCAL since they are linearly reducible to local problems. 

In this paper, we exhibit a formula of the form (2) over functional structures 
that defines some NP-complete problem and is minimal for several criteria over 
the signature of input structures, the prefix and the matrix of the formula; 
more precisely, the further strengthening of any criterion makes the problem fall 
in PTIME even if all other criteria are weakened. Moreover, this problem is 
essentially unique, up to renamings and symmetries. 

Finally, in contrast with our results about functional structures, notice that 
the similar question of determining the minimal ESO formulas (with two first- 
order variables) that define NP-complete problems over relational structures is, 
to our knowledge, widely open and seems rather difficult to us: e.g., the unicity 
of such a formula is very dubious. 

1.2 Minimal Formulas for NP-Complete Problems 

We study the problem $\mathrm{MIN}_0$ defined by the very simple EMSO formula $\varphi_0$ of 
the particular form (2) that follows. 

Notation 1. Let $\varphi_0$ denote the $\{f,g\}$-formula in conjunctive normal form 
(CNF) 

$\varphi_0 : \exists U\, \forall x\, \psi_0(x)$ where $\psi_0$ is the conjunction 

$\psi_0 : (Ux \vee Ufx) \wedge (\neg Ux \vee \neg Ufx \vee \neg Ugx)$, 

and $f$, $g$ are unary function symbols. Let $\delta_0$ denote the following formula in 
disjunctive normal form (DNF) which is logically equivalent to $\varphi_0$ 

$\delta_0 : \exists U\, \forall x\, (Ux \wedge \neg Ufx) \vee (Ux \wedge \neg Ugx) \vee (\neg Ux \wedge Ufx)$. 

The problem $\mathrm{MIN}_0$ is defined as the set of finite models $(D,f,g)$ of $\varphi_0$ (or of $\delta_0$). 
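
The logical equivalence of the CNF and DNF matrices, and the meaning of membership in the problem defined by Notation 1, can be checked by brute force on small structures. The sketch below encodes $f$ and $g$ as dictionaries and tries every subset $U$ of $D$; the function names and the two example structures are assumptions of this illustration, and the exhaustive search is exponential in $|D|$.

```python
from itertools import product


def psi0(a, b, c):
    """CNF matrix, with a = Ux, b = U(fx), c = U(gx)."""
    return (a or b) and (not a or not b or not c)


def delta0(a, b, c):
    """DNF matrix with three anticlauses."""
    return (a and not b) or (a and not c) or (not a and b)


def in_min0(D, f, g):
    """Is there a set U (given by its characteristic function) satisfying psi0 at every x?"""
    for bits in product([False, True], repeat=len(D)):
        U = dict(zip(D, bits))
        if all(psi0(U[x], U[f[x]], U[g[x]]) for x in D):
            return True
    return False


cnf_equals_dnf = all(psi0(*v) == delta0(*v)
                     for v in product([False, True], repeat=3))

# f swaps the two elements and g is the identity: U = {0} is a solution.
positive = in_min0([0, 1], {0: 1, 1: 0}, {0: 0, 1: 1})
# On a single point with f = g = identity, the two clauses contradict each other.
negative = in_min0([0], {0: 0}, {0: 0})
```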

We shall also study the following subproblems of $\mathrm{MIN}_0$: 

Notation 2. Define $\mathrm{MIN}_1$ as the set of finite models $(D,f,g)$ of $\varphi_0$, where $f$ and 
$g$ are permutations of $D$. For some functional structure $(D,f,g)$, let $G(D,f,g)$ 
denote the graph $(V,E)$ defined by $V = D$ and $E = \{(x,fx) : x \in D\} \cup \{(x,gx) : 
x \in D\} \cup \{(fx,gx) : x \in D\}$. Define $\mathrm{MIN}_2$ as the set of finite models $(D,f,g)$ 
of $\varphi_0$, where $f$ and $g$ are permutations of $D$ and $G(D,f,g)$ is planar. 

Our main results use the following notations: 

Notation 3. The atoms of a formula are its atomic subformulas. In particular, 
the distinct atoms of $\varphi_0$ (or $\delta_0$) are $Ux$, $Ufx$ and $Ugx$. The length of a formula 
is the total number of occurrences of atoms in it. The disjuncts of a DNF formula 
are called its anticlauses. 

Theorem 1 (NP-completeness). $\mathrm{MIN}_0$, $\mathrm{MIN}_1$ and $\mathrm{MIN}_2$ are NP-complete. 

Theorem 2 (Minimality). If $\mathrm{P} \ne \mathrm{NP}$, $\varphi_0$ (resp. $\delta_0$) is, for the syntactic criteria 
enumerated in the table below, a minimal EMSO formula in CNF (resp. in 
DNF) of the form $\exists U\, \forall \bar{x}\, \psi$ (where $\psi$ is quantifier-free and $\bar{x}$ is a list of first-order 
variables) that defines an NP-complete problem over functional structures. 



Criterion                    Value               Criterion                        Value
input signature              2 unary functions   distinct atoms                   3
EMSO symbols                 1                   clauses in CNF $\psi_0$          2
FO variables                 1                   length of CNF $\psi_0$           5
compositions of functions    0                   anticlauses in DNF $\delta_0$    3
equalities                   0                   length of DNF $\delta_0$         6



That means that if any of these criteria is strengthened and the other criteria 
are weakened then the problem so defined is PTIME; e.g., any formula of the form 
$\exists U\, \forall \bar{x}\, \psi$ with $\psi$ of length $< 5$ in CNF defines a PTIME problem. 

Theorem 3 (Unicity). If $\mathrm{P} \ne \mathrm{NP}$, $\varphi_0$ (resp. $\delta_0$) is - up to symmetries - the 
unique minimal EMSO formula in CNF (resp. in DNF) of the form $\exists U\, \forall \bar{x}\, \psi$ 
(where $\psi$ is quantifier-free) that defines an NP-complete problem over functional 
structures. The symmetrical formulas are obtained by any permutation of the 
terms $x$, $fx$ and $gx$ and by swap of $U$ and $\neg U$ in $\varphi_0$ (resp. $\delta_0$). 

More precisely, all the symmetrical formulas of $\varphi_0$ essentially define the same 
minimal NP-complete problem over permutation (resp. planar permutation) 
structures. In case of (general) functional structures, one obtains essentially two 
minimal NP-complete problems: the one defined by $\varphi_0$ itself, and the one defined 
by the following formula $\varphi'_0$, that is $\varphi_0$ with terms $x$ and $gx$ permuted: 

$\varphi'_0 : \exists U\, \forall x\, (Ugx \vee Ufx) \wedge (\neg Ugx \vee \neg Ufx \vee \neg Ux)$ 

1.3 Minimal Formulas for #P-Complete Problems 

Besides NP-completeness, another important concept of the theory of complex- 
ity is #P-completeness [14]. It is also natural to look for a minimal logical 
formula that defines some #P-complete problem. In this regard, it is well known 
that the generic reduction from any NP problem to Sat can (easily) be made 
parsimonious with a bijective and PTIME-computable correspondence between 
solutions. That means that the problem Sat not only “simulates” the decision 
process of any problem in NP but also “reproduces” the number of its solutions 
and the “structure” of this set of solutions. 

Notation 4. For any problem $A$ in NP, let us denote by $\#A$ the “natural” 
counting problem associated to $A$, i.e., the problem of counting the “natural” 
solutions of the instances of $A$. #P is the class of such counting problems; e.g., 
#Sat is the function which maps each propositional formula $F$ to the number of 
assignments $I$ over the variables of $F$ such that $I \models F$; similarly, $\#\mathrm{MIN}_1$ is the 
function which maps each permutation structure $S = (D,f,g)$ to the number of 
predicates $U$ such that $(S,U) \models \forall x\, \psi_0(x)$. 

We say that an ordered pair $(\rho, \mu)$ is a weakly parsimonious reduction from 
$\#A$ to $\#B$ if $\rho$ is a PTIME reduction from $A$ to $B$, $\mu$ is a PTIME-computable 
function valued in positive integers such that for each instance $w$ of $A$ we have 
$\#\{$solutions of $A$ for $w\} = \mu(w) \times \#\{$solutions of $B$ for $\rho(w)\}$. If furthermore 
$\mu = 1$, then $\rho$ is called a parsimonious reduction. We conjecture that: 

Conjecture 1. There exists no parsimonious reduction from problem #Sat to
problems #Min$_1$ or #Min$_2$.

Nevertheless, we prove in this paper that: 

Theorem 4. There exist weakly parsimonious reductions from problem #Sat
to problems #Min$_1$ and #Min$_2$.

In regard to Conjecture 1 concerning formula $\varphi_0$, it is natural to look for
another simple EMSO formula defining a problem to which Sat (and hence any
NP problem) parsimoniously reduces. Let $\varphi_{nand}$ denote the $\{f, g\}$-formula

$\varphi_{nand} : \exists U\, \forall x\, \psi_{nand}(x)$ where $\psi_{nand}$ is

$\psi_{nand} : Ux \iff \neg(Ufx \land Ugx)$ or equivalently in CNF

$\psi_{nand} : (Ux \lor Ufx) \land (Ux \lor Ugx) \land (\neg Ux \lor \neg Ufx \lor \neg Ugx)$.

Clearly, $\psi_{nand}$ (resp. $\varphi_{nand}$) implies $\psi_0$ (resp. $\varphi_0$). The formula $\varphi_{nand}$ defines the
following problems:

Notation 5. Define Nand$_1$ as the set of finite models $(D, f, g)$ of $\varphi_{nand}$ where
$f$ and $g$ are permutations of $D$. Define Nand$_2$ as the set of finite models $(D, f, g)$
of $\varphi_{nand}$ where $f$ and $g$ are permutations of $D$ and $G(D, f, g)$ is planar.
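The two forms of $\psi_{nand}$ and the implication from $\psi_{nand}$ to $\psi_0$ can be checked mechanically over the eight truth values of $(Ux, Ufx, Ugx)$. The following is a minimal sketch, not part of the paper: the function names and the two-element toy structure are assumptions made for illustration, and the counting routine is a brute-force enumeration that is exponential in $|D|$.

```python
from itertools import product


def psi_0(ux, ufx, ugx):
    """Matrix of phi_0: (Ux or Ufx) and (not Ux or not Ufx or not Ugx)."""
    return (ux or ufx) and not (ux and ufx and ugx)


def psi_nand(ux, ufx, ugx):
    """Ux <=> not (Ufx and Ugx)."""
    return ux == (not (ufx and ugx))


def psi_nand_cnf(ux, ufx, ugx):
    """CNF form: (Ux or Ufx) and (Ux or Ugx) and (not Ux or not Ufx or not Ugx)."""
    return (ux or ufx) and (ux or ugx) and not (ux and ufx and ugx)


def count_predicates(domain, f, g, psi):
    """Count the predicates U with (S, U) |= forall x psi(x), by brute force."""
    domain = list(domain)
    count = 0
    for bits in product([False, True], repeat=len(domain)):
        u = dict(zip(domain, bits))
        if all(psi(u[x], u[f[x]], u[g[x]]) for x in domain):
            count += 1
    return count


TRIPLES = list(product([False, True], repeat=3))
NAND_FORMS_AGREE = all(psi_nand(*t) == psi_nand_cnf(*t) for t in TRIPLES)
NAND_IMPLIES_PSI0 = all(psi_0(*t) for t in TRIPLES if psi_nand(*t))

# Hypothetical toy permutation structure: f swaps 0 and 1, g is the identity.
D = [0, 1]
f = {0: 1, 1: 0}
g = {0: 0, 1: 1}
print(count_predicates(D, f, g, psi_0), count_predicates(D, f, g, psi_nand))
```

On this toy structure the predicates satisfying $\forall x\, \psi_{nand}(x)$ form a subset of those satisfying $\forall x\, \psi_0(x)$, as the implication requires.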

In contrast to Conjecture 1, we can prove that: 

Theorem 5. (i) #Sat parsimoniously reduces to #Nand$_1$ (resp. #Nand$_2$).
(ii) If Conjecture 1 holds and P $\neq$ NP, then $\varphi_{nand}$ is (up to symmetries) the
unique minimal EMSO formula for which (i) holds, i.e., that defines a problem
over permutation structures $(D, f, g)$ to which #Sat parsimoniously reduces.

Surprisingly, our proofs of completeness are rather simple and the reductions
involved in Theorems 1 and 5 are essentially one and the same reduction $\rho : F \mapsto S(F)$,
described in the next section.

2 Proofs of Our Results 

2.1 The Structures Involved 

Let us recall the three kinds of instances of our problems. 

Definition 1. A function structure is a finite structure $(D, f, g)$ where $f, g :
D \to D$ are unary functions. A function structure $(D, f, g)$ is a permutation
structure (resp. is a planar permutation structure) if $f, g$ are permutations of $D$
(resp. are permutations of $D$ such that the graph $G(D, f, g)$ is planar).

Remark 1. A permutation structure $(D, f, g)$ is naturally given by its $f$- and
$g$-circuits, where an $f$-circuit of length $k$ is an orbit $a, fa, f^2a, \ldots, f^ka = a$.

Definition 2 (Planar formula and Plan-Sat). Let $F$ be a propositional
formula in CNF. Let $G(F)$ denote the following bipartite graph $(V, E)$ where
$V$ is the disjoint union of the set of variables and the set of clauses of $F$, and
$E$ is the set of pairs $(v, C)$ such that $v$ is a variable that occurs in clause $C$.
$F$ is a planar formula if $G(F)$ is a planar graph, and Plan-Sat is defined as
the satisfiability problem of planar formulas.

Our proofs of completeness use the NP-complete problem Plan-Sat [13]. 



2.2 A Gadget Structure 

We are going to describe a reduction $\rho : F \mapsto S(F)$ that associates to each
Sat (resp. Plan-Sat) instance $F$ a permutation structure $S(F)$ that contains
as substructures many occurrences of the following gadget, denoted True, whose
role is essential in our reduction.

Definition 3. True or True$(\alpha, \beta, \gamma)$ is the gadget depicted on the left of Fig. 1.

The symbolization means that the gadget True plays the role of the Boolean
constant "true" (or "1"). More formally, the following lemma expresses that in
any case, $U(\gamma)$ can and should be true whereas the value of $U(g\gamma)$ (reached via
the "pending" outgoing $g$-edge of $\gamma$) is free.

Lemma 1. Let True$(\alpha, \beta, \gamma)$ be a gadget included in a permutation structure
$S = (D, f, g)$ and $U : D \to \{0, 1\}$ be a monadic predicate$^1$.

1. If $(S, U) \models \forall x\, \psi_0$ then we have $U(\alpha) = 1$, $U(\beta) = 0$ and $U(\gamma) = 1$;

2. Conversely: if $U(\alpha) = 1$, $U(\beta) = 0$ and $U(\gamma) = 1$, then the structure
(True, $U$) satisfies $\forall x\, \psi_{nand}$ (and hence $\forall x\, \psi_0$); in other words, $\psi_{nand}(x)$
is satisfied by each element $x$ of the gadget independently of the value of $U(g\gamma)$.

Proof. Easy and left to the reader. □ 



2.3 Our Reduction 

Let us now construct our reduction $\rho : F \mapsto S(F)$ where $F$ is a Sat (resp.
Plan-Sat) instance, i.e., a conjunction of clauses $F = C_1 \land C_2 \land \cdots \land C_q$. In
the description of the permutation structure $S(F)$, we freely make use of the
following notation:

Notation 6. Whenever there exists some gadget True$(\alpha, \beta, \gamma)$ such that $g(x) =
\gamma$ and $g(\gamma) = y$, we will often write $g(x) =$ True and $g(\text{True}) = y$ for convenience.

Let us now describe the $f$- and $g$-circuits of our permutation structure $S(F)$:
• Construct an $f$-circuit $(x_i^1, nx_i^1, x_i^2, nx_i^2, \ldots, x_i^{r-1}, nx_i^{r-1}, x_i^r, nx_i^r)$ for each
variable $x_i$ with $r$ occurrences in $F$. Vertices $x_i^k$, $nx_i^k$ correspond to the $k$-th
occurrence of $x_i$ in $F$.

$^1$ For convenience, we identify the truth values "true" and "false" with 1 and 0 and
identify a monadic predicate $U \subseteq D$ with its characteristic function $U : D \to \{0, 1\}$.

[Figure 1. Left panel: the gadget True. Right panel: the reduction around a variable $x_i$ that has 4 occurrences and a clause $C_j$. Legend: $f$-edge, $g$-edge.]

Fig. 1. The gadget True and the reduction around variable $x_i$ and clause $C_j$



• Construct an $f$-circuit $(nC_j^{\ell}, C_j^{\ell}, nC_j^{\ell-1}, \ldots, nC_j^{1}, C_j^{1}, nC_j^{0})$ of odd
length for each clause $C_j = \lambda_1 \lor \cdots \lor \lambda_\ell$, where the $C_j^k$ and $nC_j^k$ are
new elements corresponding to the "prefix" of length $k$ of the clause $C_j$, defined
as $\mathrm{prefix}_k(C_j) = \lambda_1 \lor \cdots \lor \lambda_k$; also construct the $\ell + 1$ $g$-circuits $(nC_j^k, \text{True})$ for
$0 \le k \le \ell$ using $\ell + 1$ new gadgets True.

• If the $k$-th literal of $C_j$ is the $h$-th occurrence (resp. the negation of the
$h$-th occurrence) of $x_i$, construct the $g$-circuits $(C_j^k, nx_i^h, \text{True})$ and $(x_i^h, \text{True})$ (resp.
$(C_j^k, x_i^h, \text{True})$ and $(nx_i^h, \text{True})$) using two new gadgets True.

This completes the description of $S(F)$, which is represented on the right of
Fig. 1. The following lemma, which is obvious from the construction of $S(F)$, means
that our reduction preserves planarity.

Lemma 2. $F$ is a planar formula iff $S(F)$ is a planar permutation structure.



2.4 Properties of the Reduction 

Lemmas 3 and 4 that follow together mean that $\rho : F \mapsto S(F)$ is a reduction
(resp. parsimonious reduction) from Sat to the problem defined by $\psi_0$ (resp.
$\psi_{nand}$). First, the following fact, whose proof is straightforward, will be useful in
our study of the $f$-circuits of $S(F)$.

Fact 1. Let $S = (D, f, g)$ be a permutation structure and $U : D \to \{0, 1\}$ be
a monadic predicate such that $(S, U) \models \forall x\, \psi_0(x)$. Then, for every $a \in D$ such
that $(S, U) \models U(ga)$ (i.e., $U(ga) = 1$), it holds that $U(a) = 1 - U(fa)$.
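Fact 1 is a purely local property of $\psi_0$, so it can be confirmed by enumerating the eight possible values of $(U(a), U(fa), U(ga))$. The sketch below is an illustration only; the function names are assumptions, and $\psi_0$ is taken as $(Ux \lor Ufx) \land (\neg Ux \lor \neg Ufx \lor \neg Ugx)$.

```python
from itertools import product


def psi_0(ua, ufa, uga):
    """(Ua or Ufa) and (not Ua or not Ufa or not Uga), over 0/1 values."""
    return bool((ua or ufa) and not (ua and ufa and uga))


def fact_1_holds_locally():
    """Whenever psi_0 holds at a and U(ga) = 1, check U(a) = 1 - U(fa)."""
    for ua, ufa, uga in product([0, 1], repeat=3):
        if uga == 1 and psi_0(ua, ufa, uga):
            if ua != 1 - ufa:
                return False
    return True


print(fact_1_holds_locally())
```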

Here is the first implication involved in the equivalence to be proved, i.e.,
$S(F) \models \varphi_0$ (resp. $\varphi_{nand}$) iff $F$ is satisfiable.

Lemma 3. If $S(F)$ satisfies $\varphi_0$ then $F$ is satisfiable.

In order to prove Lemma 3, we need the following two claims: 

Claim 1 (Existence of a witness literal for each clause). Let $U$ be a
predicate such that $(S(F), U) \models \forall x\, \psi_0(x)$. For each clause $C_j$, there exists at
least one literal $\lambda$ in $C_j$ for which it holds: $U(nx_i^h) = 0$ if $\lambda = x_i$, and $U(x_i^h) = 0$
if $\lambda = \neg x_i$, where $\lambda$ is the $h$-th occurrence of $x_i$.

Claim 2 (Coherence of the occurrences of the same variable). Let $U$ be a
predicate such that $(S(F), U) \models \forall x\, \psi_0(x)$. For each variable $x_i$ occurring $r$ times,
it holds: $U(x_i^1) = 1 - U(nx_i^1) = U(x_i^2) = 1 - U(nx_i^2) = \cdots = U(x_i^r) = 1 - U(nx_i^r)$.

We first prove Claims 1 and 2, and then deduce Lemma 3. 

Proof (of Claim 1). Assume that the claim is false. Then there is a clause $C_j$
such that for each literal $\lambda$, it holds $U(nx_i^h) = 1$ if $\lambda = x_i$ and $U(x_i^h) = 1$ if
$\lambda = \neg x_i$. This implies $U(ga) = 1$ for each element $a$ of the $f$-circuit of $C_j$, and
hence $U(a) = 1 - U(fa)$ by Fact 1, which is impossible since the length of this
$f$-circuit is odd. □

Proof (of Claim 2). Immediate consequence of Fact 1 applied to each element $a$
of the $f$-circuit of $x_i$, since we always have $g(a) =$ True and thus $U(ga) = 1$. □

Proof (of Lemma 3). Define the assignment $I$ of the variables of $F$ as $I(x_i) =
U(x_i^h) = 1 - U(nx_i^h)$, for each variable $x_i$ and any $1 \le h \le r$, which is coherent
by Claim 2. Claim 1 ensures that in each clause $C_j$ of $F$, there is some literal $\lambda$
such that $I(\lambda) = 1$. Hence, $I \models C_j$ and $I \models F$. □

Lemma 4 states the most precise property of our reduction $\rho : F \mapsto S(F)$.

Lemma 4. There is a bijective correspondence $I \mapsto U_I$ of the set of satisfying
assignments $\{I : I \models F\}$ onto the set of monadic predicates $\{U : (S(F), U) \models
\forall x\, \psi_{nand}(x)\}$. That means that $\rho : F \mapsto S(F)$ is a parsimonious reduction from
Sat to the problem defined by $\psi_{nand}$.

Proof (of Lemma 4). For each $I$ such that $I \models F$, let us construct its associated
monadic predicate $U_I$ on the domain $D$ of $S(F)$. Correctness will be ensured
by Claim 3 and its converse Claim 4:

• Set $U_I(\alpha) = 1$, $U_I(\beta) = 0$ and $U_I(\gamma) = 1$ for each gadget True$(\alpha, \beta, \gamma)$ in
$S(F)$: this is justified by Lemma 1;

• Set $U_I(x_i^h) = I(x_i)$ and $U_I(nx_i^h) = 1 - I(x_i)$ for each variable $x_i$ occurring
$r$ times in $F$ and each $1 \le h \le r$;

• For each clause $C_j = \lambda_1 \lor \cdots \lor \lambda_\ell$, set $U_I(nC_j^0) = 1$, and for $k = 1, \ldots, \ell$, set
$U_I(C_j^k) = \mathrm{value}(\mathrm{prefix}_k(C_j), I)$, and $U_I(nC_j^k) = 1 - \mathrm{value}(\mathrm{prefix}_k(C_j), I)$, where
$\mathrm{prefix}_k(C_j) = \lambda_1 \lor \cdots \lor \lambda_k$ and in particular $C_j = \mathrm{prefix}_\ell(C_j)$.

In the following, we essentially use the well-known fact that all the Boolean
connectives can be expressed by means of the NAND connective alone. More precisely,
$1 - v = \mathrm{NAND}(v, 1)$ and $\mathrm{OR}(v, v') = \mathrm{NAND}(1 - v, 1 - v')$.
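Both identities can be confirmed by enumerating the 0/1 values. This is a small illustrative sketch; the name `nand` and the two flag names are assumptions.

```python
def nand(u, v):
    """NAND on 0/1 values."""
    return 1 - u * v


# 1 - v = NAND(v, 1)
NEGATION_OK = all(1 - v == nand(v, 1) for v in (0, 1))

# OR(v, w) = NAND(1 - v, 1 - w)
OR_OK = all(max(v, w) == nand(1 - v, 1 - w) for v in (0, 1) for w in (0, 1))

print(NEGATION_OK, OR_OK)
```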



Claim 3. $(S(F), U_I) \models \forall x\, \psi_{nand}(x)$.

Proof (of Claim 3). For each element $a$ of the $f$-circuit of any variable $x_i$, we
have $U_I(ga) = 1$ and $U_I(a) = 1 - U_I(fa)$, and hence $(S(F), U_I) \models U(a) \iff
\mathrm{NAND}(U(fa), U(ga))$. For every clause $C_j$ of length $\ell$, one easily obtains the
following equalities for $1 \le k \le \ell$ if $C_j^k = C_j^{k-1} \lor x_i^h$:

• $U_I(nC_j^k) = 1 - U_I(C_j^k) = \mathrm{NAND}(U_I(C_j^k), 1)$, and

• $U_I(C_j^k) = \mathrm{NAND}(U_I(nC_j^{k-1}), U_I(nx_i^h))$;

and similarly in the case $C_j^k = C_j^{k-1} \lor \neg x_i^h$. This proves $(S(F), U_I) \models \psi_{nand}(a)$
for every element $a \neq nC_j^0$ in the $f$-circuit of $C_j$. Finally, this also holds for
$a = nC_j^0$ since $U_I(nC_j^\ell) = \mathrm{value}(\neg C_j, I) = 0$ and, as a consequence, $U_I(nC_j^0) =
1 = \mathrm{NAND}(U_I(nC_j^\ell), 1)$ as required. This completes the proof of Claim 3. □
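The propagation along a clause circuit can be simulated directly. The sketch below is an illustration under stated assumptions rather than the authors' construction: it takes only the truth values of the literals $\lambda_1, \ldots, \lambda_\ell$ under $I$, uses the fact that the $g$-neighbour of $C_j^k$ carries the value $1 - \mathrm{value}(\lambda_k, I)$ for a positive or a negated occurrence alike, and checks that the NAND equalities reproduce the prefix values. All function names are hypothetical.

```python
from itertools import product


def nand(u, v):
    return 1 - u * v


def clause_circuit_values(literal_values):
    """Return [(U(C^k), U(nC^k)) for k = 1..l], starting from U(nC^0) = 1."""
    n_c = 1  # U(nC^0) = 1
    values = []
    for lam in literal_values:
        c = nand(n_c, 1 - lam)  # U(C^k) = NAND(U(nC^{k-1}), 1 - value(lambda_k))
        n_c = nand(c, 1)        # U(nC^k) = NAND(U(C^k), 1)
        values.append((c, n_c))
    return values


def matches_prefix_semantics(literal_values):
    """Check U(C^k) = value(prefix_k) and U(nC^k) = 1 - value(prefix_k)."""
    running_or = 0
    pairs = clause_circuit_values(literal_values)
    for lam, (c, n_c) in zip(literal_values, pairs):
        running_or = max(running_or, lam)
        if c != running_or or n_c != 1 - running_or:
            return False
    return True


def circuit_closes(literal_values):
    """The condition at nC^0: U(nC^0) = 1 must equal NAND(U(nC^l), 1)."""
    last_n_c = clause_circuit_values(literal_values)[-1][1]
    return nand(last_n_c, 1) == 1


CASES = [v for n in range(1, 5) for v in product((0, 1), repeat=n)]
PREFIX_OK = all(matches_prefix_semantics(v) for v in CASES)
CLOSURE_OK = all(circuit_closes(v) == (max(v) == 1) for v in CASES)
print(PREFIX_OK, CLOSURE_OK)
```

The closing condition at $nC_j^0$ holds exactly when at least one literal of the clause is true, which is where the satisfaction of $C_j$ by $I$ is used.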

It remains to prove the converse of Claim 3. 

Claim 4. Let $U$ be a monadic predicate such that $(S(F), U) \models \forall x\, \psi_{nand}(x)$.
Then there is an assignment $I$, of course unique, such that $U = U_I$ and $I \models F$.

Proof (of Claim 4). It is a variant of the proof of Lemma 3 and is left to the
reader. This completes the proof of Lemma 4. □

Lemmas 2, 3 and 4 together imply the following: 

Corollary 1. (i) Sat (resp. Plan-Sat) reduces to problem Min$_1$ (resp. Min$_2$)
by the reduction $\rho : F \mapsto S(F)$. (ii) #Sat (resp. #Plan-Sat) parsimoniously
reduces to problem #Nand$_1$ (resp. #Nand$_2$) by the same reduction.

So, we have proved Theorems 1 and 5(i), by making use of the known result
that #Sat parsimoniously reduces to #Plan-Sat [13]. A careful analysis of our
reduction $\rho : F \mapsto S(F)$ from Sat (Plan-Sat) to Min$_1$ (Min$_2$) shows that the
only parts of $S(F)$ where this reduction is not parsimonious are the $f$-circuits of
the clauses of $F$ when at least two literals of some clause of $F$ are true together.
On the other hand, it is known that the problem 1/3-Sat (also denoted one-in-
three-SAT, see [6]) and its planar restriction Plan-1/3-Sat defined below are
equivalent to Sat and Plan-Sat under parsimonious reductions (see [10]).

Definition 4. Let 1/3-Sat (resp. Plan-1/3-Sat) denote the satisfiability problem
of a conjunction of 1/3-clauses (resp. planar 1/3-clauses) of the form 1/3$(a, b, c)$ whose
meaning is "exactly one of the three variables $a, b, c$ is true".

Theorem 4 is a straightforward consequence of the following lemma:

Lemma 5. #1/3-Sat (resp. #Plan-1/3-Sat) reduces to #Min$_1$ (resp. #Min$_2$)
under a weakly parsimonious reduction.

Proof. Let $F \mapsto F'$ be the trivial parsimonious and planarity-preserving reduction
from 1/3-Sat (resp. Plan-1/3-Sat) to Sat (resp. Plan-Sat) that replaces
every 1/3-clause 1/3$(a, b, c)$ by the logically equivalent conjunction $(a \lor b \lor c) \land
(\neg a \lor \neg b) \land (\neg b \lor \neg c) \land (\neg c \lor \neg a)$. One notices that in each clause of this
conjunction, except one 2-clause, e.g., $C = \neg a \lor \neg b$, exactly one literal is true, and both
literals of $C$ are true. Let us now consider the composed reduction $\rho' : F \mapsto S(F')$
from 1/3-Sat (Plan-1/3-Sat) to Min$_1$ (Min$_2$). If $F$ contains $q$ 1/3-clauses then it
holds that $\#\{U : (S(F'), U) \models \forall x\, \psi_0(x)\} = 2^q \times \#\{I : I \models F\}$.

This is easily justified by a careful analysis of the $f$-circuits of clauses (of $F'$)
in $S(F')$: one sees that each 1/3-clause of $F$ gives exactly 2 "local configurations"
of the (union of four) $f$-circuits of the four corresponding clauses of $F'$. □
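The two observations about the four-clause replacement (logical equivalence with "exactly one of $a, b, c$", and exactly one 2-clause having both literals true) can be verified by enumerating the eight assignments. This is a hedged sketch for illustration; the function names are assumptions and nothing here reproduces the structure $S(F')$ itself.

```python
from itertools import product


def one_in_three(a, b, c):
    """Exactly one of the three variables is true."""
    return (a, b, c).count(True) == 1


def cnf_clauses(a, b, c):
    """Literal values of (a|b|c), (~a|~b), (~b|~c), (~c|~a)."""
    return [[a, b, c], [not a, not b], [not b, not c], [not c, not a]]


def check_replacement():
    for a, b, c in product([False, True], repeat=3):
        clauses = cnf_clauses(a, b, c)
        satisfied = all(any(cl) for cl in clauses)
        # The conjunction must be equivalent to the one-in-three clause.
        if satisfied != one_in_three(a, b, c):
            return False
        if satisfied:
            # Number of true literals per clause: the 3-clause has one,
            # two of the 2-clauses have one, and one 2-clause has two.
            if sum(clauses[0]) != 1:
                return False
            if sorted(sum(cl) for cl in clauses[1:]) != [1, 1, 2]:
                return False
    return True


print(check_replacement())
```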

2.5 Minimality of $\varphi_0$ and $\delta_0$ in Theorem 2

We consider EMSO formulas of the form $\varphi : \exists \bar{U}\, \forall \bar{x}\, \psi$, where $\bar{U}$ (resp. $\bar{x}$) is a list
of monadic relation symbols (resp. first-order variables) and $\psi$ is quantifier-free.

Proof. There is nothing to prove about the absence of composition of functions
and the absence of equality. We prove the minimality of:

• the input signature (= 2 unary function symbols): a famous theorem of
Courcelle [2] asserts that any MSO property of bounded tree-width structures
can be checked in deterministic linear time. In particular, any EMSO property
of $\sigma$-structures with $\sigma = \{f, U_1, \ldots, U_k\}$, where $f$ is a unary function symbol and
$U_1, \ldots, U_k$ are monadic relation symbols, is checkable in linear time.

• the number of EMSO symbols (= 1): immediate since any first-order (FO)
property is in AC$^0$ and thus in PTIME.

• the number of FO symbols (= 1): trivial.

• the number of clauses in $\psi_0$ (= 2): assume $\varphi$ is an ESO formula in CNF
with only one clause. If some ESO symbol occurs in $\psi$ then $\psi$ defines a trivial
"yes"-problem. Otherwise, $\psi$ defines a first-order property.

• the length of $\psi_0$ (= 5): if the length of $\psi$ in CNF is $\le 4$ then $\psi$ either: (i)
contains only clauses of length $\le 2$, or (ii) contains only one clause (of length 3
or 4), or (iii) contains exactly one 3-clause and one unit clause. In case (i),
$\psi$ is ESO-Krom and, as a consequence, defines a PTIME problem [8]. In case
(ii), $\psi$ defines a PTIME problem as was noticed above. Finally, in case (iii),
one observes that the 3-clause either contains $\le 1$ positive literal or contains
$\le 1$ negative literal. Hence, $\psi$ is either ESO-Horn or ESO-Anti-Horn, and thus
defines a PTIME problem [8].

• the number of distinct atoms (= 3): if $\psi$ in CNF contains $\le 2$ distinct
atoms, then its clauses are trivially of length $\le 2$, and $\psi$ is ESO-Krom.

• the number of anticlauses in $\delta_0$ (= 3): notice that any formula $\varphi$ in DNF
that contains $\le 2$ disjuncts is equivalent to a CNF formula that consists of
clauses of length $\le 2$.

• the length of $\delta_0$ in DNF (= 6): w.l.o.g., assume that $\varphi$ in DNF is of the
form $\varphi : \exists U\, \forall x\, (\psi_0 \lor \psi_1)$, where $\psi_0$ (resp. $\psi_1$) is a disjunction of anticlauses in
each of which no (resp. at least one) EMSO symbol occurs. If $\psi_1$ contains a unit
anticlause, then $\varphi$ defines a trivial "yes"-problem. Moreover, if the number of
anticlauses in $\psi_1$ is $\le 2$, then $\varphi$ defines a PTIME problem. Thus, if $\varphi$ defines an
NP-complete problem then $\psi_1$ consists of at least 3 anticlauses of length $\ge 2$. □



2.6 Unicity up to Symmetries of $\varphi_0$ and $\delta_0$ in Theorem 3

Let us prove the unicity of $\varphi_0$ (the proof for $\delta_0$ is similar). Let $\varphi$ be an EMSO
formula in CNF that satisfies the conditions of the table of Theorem 2 and defines
an NP-complete problem over functional structures $(D, f, g)$. The list of atoms
that occur in $\varphi$ is $Ux$, $Ufx$, $Ugx$, and $\varphi$ is of the form $\exists U\, \forall x\, \psi(f, g, U, x)$,
where $\psi$ is a conjunction of two clauses $C_1$ and $C_2$ with $|C_1| + |C_2| = 5$ and
$|C_1| \le |C_2| \le 3$. That implies $|C_1| = 2$ and $|C_2| = 3$.

Proof. One notices that one clause consists of positive literals and the other one
consists of negative literals: otherwise, $\psi$ would define a trivial "yes"-problem.
That implies that $\varphi$ has one of the following two forms $\varphi_0$ or $\varphi'_0$ as defined in
Subsection 1.2, up to permutations of $f$ and $g$ and swap of $U$ and $\neg U$.

Formulas $\varphi_0$ and $\varphi'_0$ essentially define the same problem over (planar) permutation
structures $(D, f, g)$: by replacing $x$ by $g^{-1}x$ in the matrix of the
formula $\varphi_0$, we immediately get $(D, f, g) \models \varphi_0(f, g)$ iff $(D, f', g') \models \varphi'_0(f', g')$,
where $f' = g^{-1}$ and $g' = fg^{-1}$. This also makes sense for planar permutation
structures since $G(D, f, g)$ is planar iff $G(D, f', g')$ is planar. □

It remains to prove Theorem 5(ii), more precisely reformulated as follows:
assume Conjecture 1 and P $\neq$ NP. Then $\varphi_{nand}$ is (up to permutations of $x$, $fx$,
$gx$ and swap of $U$ and $\neg U$) the unique minimal EMSO $\{f, g\}$-formula in CNF
of the form $\exists U\, \forall x\, \psi(x)$ with the only atoms $Ux$, $Ufx$ and $Ugx$ that defines
a problem over permutation structures to which #Sat parsimoniously reduces.
More precisely, $\varphi_{nand}$ has a minimal number of clauses (= 3), and a minimal
length (= 7).

2.7 Minimality of $\varphi_{nand}$ in Theorem 5(ii)

Proof. We prove the minimality of:

• the number of clauses (= 3): clearly, any EMSO formula $\varphi$ of the required
form that defines an NP-complete problem (over permutation structures) with
exactly two clauses has exactly one purely negative clause and one purely positive
clause, and has at least one 3-clause and no unit clause$^2$; so, the other one has
length 2 or 3. This gives only two possible forms: our minimal formula $\varphi_0$ (and
its symmetrical variants), and $\varphi_{nae}$ defined as:

$\varphi_{nae} : \exists U\, \forall x\, \psi_{nae}(x)$ where $\psi_{nae}$ is the "not-all-equal" formula

$\psi_{nae} : (Ux \lor Ufx \lor Ugx) \land (\neg Ux \lor \neg Ufx \lor \neg Ugx)$.

One easily sees that for any function structure $S$, the number $\#\{U : (S, U) \models
\forall x\, \psi_{nae}(x)\}$ is even because $\psi_{nae}$ is invariant under the exchange of $U$ and $\neg U$. So, no
reduction from Sat to the problem defined by $\varphi_{nae}$ (if such a polynomial reduction
exists) can be parsimonious with the standard way of counting solutions.

• the length (= 7): it is a consequence of the fact that there should be at
least three clauses of length $\ge 2$ with at least one of length 3. □
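The parity argument for $\psi_{nae}$ can be observed experimentally on small structures: complementing a solution $U$ yields another solution, and $U$ never equals its own complement on a nonempty domain. The following sketch is an illustration only; it draws random function structures with a fixed seed, and every name in it is an assumption.

```python
from itertools import product
import random


def psi_nae(ux, ufx, ugx):
    """Not-all-equal: (Ux or Ufx or Ugx) and (not Ux or not Ufx or not Ugx)."""
    return not (ux == ufx == ugx)


def nae_solutions(n, f, g):
    """All predicates U on {0, ..., n-1} satisfying forall x psi_nae(x)."""
    sols = []
    for u in product([False, True], repeat=n):
        if all(psi_nae(u[x], u[f[x]], u[g[x]]) for x in range(n)):
            sols.append(u)
    return sols


def complement(u):
    return tuple(not b for b in u)


def random_structure(n, rng):
    """A random function structure ({0..n-1}, f, g), not necessarily a permutation one."""
    f = [rng.randrange(n) for _ in range(n)]
    g = [rng.randrange(n) for _ in range(n)]
    return f, g


rng = random.Random(0)
PARITY_OK = True
for _ in range(20):
    n = rng.randint(1, 6)
    f, g = random_structure(n, rng)
    sols = set(nae_solutions(n, f, g))
    closed_under_complement = all(complement(u) in sols for u in sols)
    if len(sols) % 2 != 0 or not closed_under_complement:
        PARITY_OK = False
print(PARITY_OK)
```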

2.8 Unicity of $\varphi_{nand}$ in Theorem 5(ii)

Proof. Clearly, any formula that meets our minimality conditions, i.e., that has
three clauses and length 7, has exactly one 3-clause and two 2-clauses. Moreover:

(i) At least one clause is purely positive and at least one is purely negative;

(ii) No 2-clause subsumes the 3-clause;

$^2$ If $\varphi$ contained a unit clause, then it would define either a trivial "yes"-problem or a
trivial "no"-problem.

(iii) Each 2-clause must disagree with the 3-clause on the sign of every literal:
otherwise, if we write the 3-clause as $(\ell_1 \lor \ell_2 \lor \ell_3)$, either the 2-clause is of the
form $(\ell_1 \lor \ell_2)$ and then it subsumes the 3-clause, or the 2-clause is of the form
$(\bar{\ell}_1 \lor \ell_2)$ and then a resolution step over $\ell_1$ induces the 2-clause $(\ell_2 \lor \ell_3)$ that in
turn subsumes the 3-clause. This contradicts (ii);

(iv) The 2-clauses have exactly one atom in common: they clearly have at
least one since there are only three atoms available. Now, if they have two, they
disagree on the sign of either one literal or two literals. If we have $(\ell_1 \lor \bar{\ell}_2) \land
(\ell_1 \lor \ell_2)$, then a resolution step over $\ell_2$ induces the unit clause $(\ell_1)$. If we have
$(\ell_1 \lor \bar{\ell}_2) \land (\bar{\ell}_1 \lor \ell_2)$, then $\ell_1 \iff \ell_2$ and the 3-clause reduces either to a 2-clause
or to "true" by replacing $\ell_1$ by $\ell_2$;

(v) The 3-clause must be monotone. Otherwise, by (i), the two 2-clauses
must be monotone of opposite signs: let then $\varepsilon$ be the majority sign of the
3-clause. The 2-clause of sign $\varepsilon$ cannot disagree on the sign of every literal with
the 3-clause, since the latter has only one literal of sign $\bar{\varepsilon}$. This contradicts (iii);

(vi) Both 2-clauses are monotone, of the same sign, opposite to the sign of
the 3-clause: this is a direct consequence of (iii) and (v).

Clearly, Remarks (iv), (v) and (vi) together leave exactly $\psi_{nand}$ and its
symmetrical variants as the only candidates. □



3 Conclusion and Open Problems 

Exhibiting “the” minimal EMSO formula that defines an NP-complete problem 
over functional structures is the main contribution of this paper. The “mini- 
mality” is also strengthened by the fact that this main result also holds when 
restricted to permutation structures or even to planar permutation structures 
which seem to be the simplest functional structures. A striking point is the 
unicity (up to symmetries) of our formula. More precisely, we have seen that
all the symmetrical forms of our minimal formula essentially define only two
distinct NP-complete problems over functional structures (see formulas $\varphi_0$ and
$\varphi'_0$ in Section 2.6) and only one such problem over permutation (resp. planar
permutation) structures. This delineates a very neat frontier in logic between
NP-complete problems and tractable ones. Several open problems remain:

The first one is the analogous minimality question over relational structures.
The second one is Conjecture 1 and its analogue for function structures: is there
a parsimonious reduction from #Sat to #Min$_0$? A difficulty in counting
complexity is to define a relevant notion of reduction. Recently, Durand et al. [3]
have defined an interesting reduction, called subtractive reduction, under which
#P and other counting complexity classes are closed and have significant
complete problems. If positively answered, the following question may be easier and
more relevant than Conjecture 1: is there a subtractive reduction from #Sat to
#Min$_1$ and #Min$_2$ (i.e., are the latter #P-complete under such reductions)?

Another interesting objective consists in looking for a necessary and sufficient
decidable condition under which an EMSO formula of the form $\exists U\, \forall x\, \psi(x, fx)$
and of unary signature $f$ expresses an NP-complete problem over $f$-structures
(resp. over permutation $f$-structures, or over planar permutation $f$-structures).

Finally, does the EMSO formula $\varphi_{nae}$ of Subsection 2.7 define a PTIME or
NP-complete problem over permutation structures? Notice that $\varphi_{nae}$ defines a
PTIME problem over planar permutation structures since the problem Nae-Sat
is PTIME for planar instances [11].



Acknowledgments. The authors thank the referees for their helpful comments 

Abstract. Given four distinct vertices $s_1$, $s_2$, $t_1$, and $t_2$ of a graph $G$,
the 2-disjoint paths problem is to determine two disjoint paths, $p_1$ from
$s_1$ to $t_1$ and $p_2$ from $s_2$ to $t_2$, if such paths exist. Disjoint can mean
vertex- or edge-disjoint.

Both the edge- and the vertex-disjoint versions of the problem are NP-hard
in the case of directed graphs. For undirected graphs, we show that
the $O(mn)$-time algorithm of Shiloach can be modified so as to solve
the 2-(vertex-)disjoint paths problem in $O(n + m\alpha(m, n))$ time, where
$m$ is the number of edges in $G$, $n$ is the number of vertices in $G$, and $\alpha$
denotes the inverse of the Ackermann function. Our result also improves
the running time for the 2-edge-disjoint paths problem on undirected
graphs as well as the running times for the decision versions of the
2-vertex- and the 2-edge-disjoint paths problem on dags.



1 Introduction 

In the $k$-disjoint paths problem ($k \in \mathbb{N}$) we have $2k$ pairwise distinct vertices
$s_1, \ldots, s_k, t_1, \ldots, t_k$ of a graph $G = (V, E)$, and we want to output $k$ pairwise
disjoint paths $p_i$ from $s_i$ to $t_i$ ($1 \le i \le k$), if such paths exist. For short, we will
subsequently refer to the $k$-disjoint paths problem as $k$-DPP or, more precisely,
as $k$-VDPP if disjoint means vertex-disjoint, and as $k$-EDPP if disjoint means
edge-disjoint. For time bounds, we let $n = |V|$ and $m = |E|$.

The $k$-DPP arises in the context of VLSI design, routing problems, and network
reliability (see [1] and [15]) and has been extensively studied. A short
overview is given in this introduction. Further overviews can be found in [2],
[21], and [22]. We will also consider the decision version of the $k$-DPP in which
we only want to test the existence of $k$ disjoint paths $p_i$ from $s_i$ to $t_i$ ($1 \le i \le k$).
Given an $O(T(n, m))$-time algorithm for the decision version of the $k$-DPP, the
disjoint paths, if they exist, can be computed within $O(n + mT(n, m))$ time using
the following algorithm: Step through all edges of the input graph $G$, and, for
each edge $e$ considered, if there are $k$ pairwise disjoint paths connecting $s_i$ to $t_i$
for $1 \le i \le k$ in the graph $G - \{e\}$,$^1$ delete $e$ from $G$ before considering the next
edge. After having considered all edges, the resulting graph consists of exactly $k$
disjoint paths connecting $s_i$ to $t_i$ for $1 \le i \le k$. The paths themselves can be
output with a depth-first search in the case of the $k$-VDPP. In the case of the
$k$-EDPP the construction of the paths is more complicated. It can be done in
$O(n + mT(n, m))$ time, but we will not show this here.

$^1$ For short, given a graph $G = (V, E)$ and a set $W \subseteq V$ we define $G - W$ to be the
graph $(V - W, E \cap \{\{u, v\} \mid u, v \in V - W\})$, and, similarly, for each set $F \subseteq E$ we
let $G - F$ be the graph $(V, E - F)$.
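The edge-pruning procedure can be sketched for the vertex-disjoint case as follows. This is a minimal illustration under stated assumptions, not the algorithm of the paper: the decision oracle `has_disjoint_paths` is a hypothetical exponential-time brute-force search that only stands in for an $O(T(n, m))$-time decision algorithm, and both function names are invented for the example.

```python
def has_disjoint_paths(edges, pairs, used=frozenset()):
    """Brute-force oracle: are there vertex-disjoint paths s_i -> t_i for all pairs?"""
    if not pairs:
        return True
    (s, t), rest = pairs[0], pairs[1:]
    # Terminals of the remaining pairs must not lie on the current path.
    reserved = {v for pair in rest for v in pair}
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def extend(path):
        last = path[-1]
        if last == t:
            return has_disjoint_paths(edges, rest, used | set(path))
        for w in adj.get(last, ()):
            if w not in path and w not in used and w not in reserved:
                if extend(path + [w]):
                    return True
        return False

    return extend([s])


def prune_to_paths(edges, pairs):
    """Delete every edge whose removal keeps the instance solvable.

    Returns the remaining edges (exactly the edges of k disjoint paths),
    or None if the instance has no solution.
    """
    edges = list(edges)
    if not has_disjoint_paths(edges, pairs):
        return None
    kept = list(edges)
    for e in edges:
        trial = [x for x in kept if x != e]
        if has_disjoint_paths(trial, pairs):
            kept = trial  # e is not needed: delete it from G
    return kept


# Hypothetical instance: two terminal pairs joined through a and b,
# plus two extra edges that the pruning should remove.
EDGES = [("s1", "a"), ("a", "t1"), ("s2", "b"), ("b", "t2"), ("a", "b"), ("s1", "s2")]
PAIRS = [("s1", "t1"), ("s2", "t2")]
print(prune_to_paths(EDGES, PAIRS))
```

Every edge that survives was indispensable at the moment it was examined, so the final edge set coincides with the edges of one solution; a depth-first search from each $s_i$ then lists the paths.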

Previous results. For directed graphs, the decision versions of the $k$-EDPP
and the $k$-VDPP are NP-complete, even for $k = 2$, as shown by Fortune,
Hopcroft, and Wyllie [5]. However, in [17] Perl and Shiloach presented an $O(mn)$-time
algorithm for solving the 2-VDPP on dags (directed acyclic graphs). Fortune,
Hopcroft, and Wyllie [5] generalized this result of Perl and Shiloach to a
polynomial-time algorithm for the $k$-VDPP on dags. Lucchesi and Giglio [13]
gave a linear-time reduction from the decision version of the 2-VDPP on dags
to the 2-VDPP on undirected graphs. Solving the latter problem with the
previously best known algorithm for the 2-VDPP on undirected graphs leads to an
improved running time of $O(n^2)$ for the decision version of the 2-VDPP on dags.
This bound holds also for the decision version of the 2-EDPP as we will show in
Sect. 3 of this paper by using an $O(n + m \log_{2+m/n} n)$-time reduction from the
2-EDPP to the 2-VDPP. As an application, Schrijver [20] described an airplane
routing problem that can be solved with an algorithm for the $k$-EDPP on dags.

If $k$ is not fixed, as defined above, but part of the input, the problem of testing
whether there are $k$ disjoint paths $p_i$ from $s_i$ to $t_i$ ($1 \le i \le k$) is NP-complete
also for undirected graphs. This was shown by Knuth, cf. [10], and Lynch [14]
for the VDPP and by Even, Itai, and Shamir [4] for the EDPP.

The first polynomial-time algorithms for the 2-DPP on undirected graphs
were given by Ohtsuki [15], Seymour [23], Shiloach [24], and Thomassen [26].
More precisely, Seymour gave only a solution for the decision version of the
2-DPP, but for both the 2-EDPP and the 2-VDPP. Ohtsuki, Shiloach, and
Thomassen considered only the 2-VDPP. However, with the reductions described
in [17] their algorithms can also be used for solving the 2-EDPP without
increasing the running time of $O(nm)$ of Ohtsuki's and Shiloach's algorithms. Later,
Khuller, Mitchell, and Vazirani implicitly showed in [11] that the algorithm of
Shiloach can be modified so as to run in $O(n^2)$ time. Using an appropriate
reduction, one can also show that the 2-EDPP can be solved within the same time
bound (see Sect. 5 for an example of such a reduction). Finally, in [7] Gustedt
described an $O(n + m \log n)$-time algorithm for the 2-VDPP on undirected graphs.
Unfortunately, since some of the lemmas in [7] fail for certain types of graphs,
the current version of Gustedt's algorithm does not work on all graphs. But,
for all instances of the 2-VDPP for which $G$ is a triconnected graph such that
there is no vertex $v$ of $G$ with $v \notin \{s_1, s_2, t_1, t_2\}$ that can be separated from
$\{s_1, s_2, t_1, t_2\}$ by a separator of size three, the lemmas of [7] mentioned above
hold and, as a byproduct of this paper, we prove that the 2-VDPP can be
reduced to the 2-VDPP on this restricted set of graphs. Concerning the more
general $k$-DPP, Robertson and Seymour [18] showed that the decision version of
the undirected $k$-VDPP is solvable in $O(n^3)$ time. Finally, Perković and Reed
[16] improved the running time to $O(n^2)$, which is currently the best known time
bound for the decision version of the $k$-DPP on undirected graphs, for all $k \ge 3$.
For some kinds of graphs, the 2-DPP or the general $k$-DPP are solvable in linear
time (see [17], [6], [12], and [19]).

New results. In this paper we present an algorithm for solving the 2-VDPP
on undirected graphs in $O(n + m\alpha(m, n))$ time, which improves the previously
best known time bound of $O(n^2)$ to $O(\min\{n^2, n + m\alpha(m, n)\})$. Here $\alpha$ denotes the
inverse of the Ackermann function. Applying some well-known reductions or
slightly modified versions thereof, we will also show that our result leads to
an $O(m\alpha(m, n) + n \log n)$ time bound for the 2-EDPP on undirected graphs, an
$O(n + m\alpha(m, n))$ time bound for the decision version of the 2-VDPP on dags,
and finally an $O(n + m \log_{2+m/n} n)$ time bound for the decision version of the
2-EDPP on dags.

The paper is organized as follows. In Sect. 2 we discuss Shiloach's algorithm
for the 2-VDPP on undirected graphs. We will see that the 2-VDPP can be
reduced to a problem that basically consists in eliminating, without changing the
solvability of the problem, all vertices of an instance of the 2-VDPP that can
be separated from $s_1$, $s_2$, $t_1$, and $t_2$ by a separator consisting of three or fewer
vertices. Following the presentation, in Sect. 3, of some simple definitions
concerning $k$-vertex-connectivity, the problem above is solved in Sect. 4. Other
versions of the 2-DPP are discussed in Sect. 5.

2 Shiloach’s Algorithm 

In this section we consider Shiloach's algorithm for the 2-VDPP on undirected
graphs. For an instance $I = (G, s_1, s_2, t_1, t_2)$ of the 2-VDPP, if there exist two
vertex-disjoint paths $p_1$ from $s_1$ to $t_1$ and $p_2$ from $s_2$ to $t_2$, we say that $I$ has a
solution and that $p_1$ and $p_2$ solve $I$. Given two instances $I_1 = (G, s_1, s_2, t_1, t_2)$
with $G = (V, E)$ and $I_2 = (G', s'_1, s'_2, t'_1, t'_2)$ with $G' = (V', E')$ of the 2-VDPP,
let us say that $I_1$ is 2-paths reducible (or for short 2P-reducible) to $I_2$ if the
following conditions hold: first, $I_1$ has a solution iff $I_2$ has a solution; second,
$|V'| = O(|V|)$; third, $|E'| = O(|E|)$; and fourth, given a solution
of $I_2$ we can solve $I_1$ in $O(|V| + |E|)$ time. If we replace an instance $I_1$ of
the 2-VDPP by another instance $I_2$ such that $I_1$ is 2P-reducible to $I_2$ we also
say that $I_1$ is 2P-reduced to $I_2$.

In [24] Shiloach outlined a proof of Itai according to which we can assume
w.l.o.g. that the input graph $G = (V, E)$ of an instance $I = (G, s_1, s_2, t_1, t_2)$ of
the 2-VDPP is triconnected. On a triconnected graph the algorithm of Shiloach
proceeds as follows: If $G$ is planar, the problem instance $I$ is solved with the
algorithm of Perl and Shiloach [17]. Otherwise $I$ is 2P-reduced to an instance
$I' = (G', s_1, s_2, t_1, t_2)$ of the 2-VDPP with $G' = (V', E')$ a triconnected graph
such that there are four vertex-disjoint paths from $s_1$, $s_2$, $t_1$, and $t_2$ to any other
set $S \subseteq V' - \{s_1, s_2, t_1, t_2\}$ with at most four vertices (if $|S| < 4$ the end points
of the paths in $S$ may overlap). According to Shiloach, the 2-VDPP on such an
instance $I' = (G', s_1, s_2, t_1, t_2)$ can be solved as follows: First extract a subgraph
of $G'$ homeomorphic to $K_5$ or $K_{3,3}$. If no such subgraphs exist, $G'$ is planar and
$I'$ can be solved with the algorithm of Perl and Shiloach [17]. Otherwise, if there
is a subgraph homeomorphic to $K_5$, two disjoint paths from $s_1$ to $t_1$ and from
$s_2$ to $t_2$ can be found with an algorithm of Watkins [27]. For the remaining case,
Shiloach proved that using some further reductions two disjoint paths from $s_1$
to $t_1$ and from $s_2$ to $t_2$ can be constructed in linear time.

Concerning the complexity of the algorithm above, Shiloach showed that
nearly all steps of the algorithm run in $O(m + n)$ time. Only two bottlenecks
were not solved in $O(m + n)$ time by Shiloach: The first one is the reduction
from $I$ to $I'$, and the second one is the extraction of a subgraph homeomorphic
to $K_{3,3}$ or $K_5$. But Williamson [28] gave a linear-time algorithm for the latter
problem. Thus, for solving the 2-VDPP in $O(n + m\alpha(m, n))$ time, we only need
to show that, given an instance $I = (G, s_1, s_2, t_1, t_2)$ of the 2-VDPP, where $G$ is a
triconnected graph, it is possible to construct, in $O(m\alpha(m, n))$ time, an instance
$I' = (G', s_1, s_2, t_1, t_2)$ of the 2-VDPP, where $G' = (V', E')$ is a triconnected
graph such that in $G'$ there are four disjoint paths from $s_1$, $s_2$, $t_1$, and $t_2$ to any
other set $S \subseteq V' - \{s_1, s_2, t_1, t_2\}$ with $|S| \le 4$, and $I$ is 2P-reducible to $I'$. This
is exactly what we will do in Sect. 4.

3 k-Vertex-Connectivity: Facts and Definitions

The following facts and definitions related to $k$-vertex-connectivity will be useful
for understanding the correctness of our algorithm for the 2-VDPP in Sect. 4.
Let $G = (V, E)$ be an undirected graph and let $s, t \in V$ with $s \ne t$. We say that $s$
and $t$ are $k$-vertex-connected (or that $s$ is $k$-vertex-connected to $t$) iff there are $k$
internally vertex-disjoint paths from $s$ to $t$. By internally vertex-disjoint we mean
that no two of the $k$ paths have a vertex in common, except $s$ and
$t$. We say that $G$ is $k$-vertex-connected if all pairs of vertices of $G$ are
$k$-vertex-connected. If two distinct vertices $s$ and $t$ of a graph $G = (V, E)$ with $\{s, t\} \notin E$
are not $(k + 1)$-vertex-connected, they can be separated by a $k$-separator:

Definition 1. Let $G = (V, E)$ be an undirected graph. Then we call a subset
$S \subseteq V$ with $|S| = k$ a ($k$-)separator (of $G$) iff the graph $G^* = G - S$ is not
connected. If two vertices $s, t$ of $G^*$ are not connected in $G^*$, we say that $S$
separates $s$ and $t$ (in $G$). If, for a vertex $s$ and a set $T \subseteq V$, the connected
component of $G^*$ containing $s$ does not contain any vertex of $T$, we say that $s$
can be separated from $T$ by $S$ (in $G$) or that $S$ separates $s$ from $T$ (in $G$).

For an undirected graph $G = (V, E)$ and two distinct vertices $s, t \in V$, let us
define $\kappa(s, t)$ as the size of a smallest separator $S$ (containing neither $s$ nor $t$) that
separates $s$ and $t$ if $\{s, t\} \notin E$, and as one plus the size of a smallest separator $S$
(containing neither $s$ nor $t$) that separates $s$ and $t$ in $G - \{\{s, t\}\}$ if $\{s, t\} \in E$.
Then there is an alternative characterization of $k$-vertex-connectivity:

Lemma 2. Two vertices $s$ and $t$ of an undirected graph $G = (V, E)$ are
$k$-vertex-connected iff $\kappa(s, t) \ge k$. $G$ is $k$-vertex-connected iff $\kappa(s, t) \ge k$ for all $s, t \in V$.
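
The quantity in Lemma 2 can be computed on small examples with a textbook unit-capacity maximum flow after splitting every vertex other than $s$ and $t$ into an "in" and an "out" copy. The following is only a minimal sketch for experimentation: the function name `local_connectivity` and the adjacency-dictionary representation are assumptions made here, not part of the paper, and this brute-force routine is unrelated to the efficient data structure used in Sect. 4.

```python
from collections import deque

def local_connectivity(adj, s, t):
    """Return the number of internally vertex-disjoint s-t paths.

    adj maps every vertex of an undirected graph (no self-loops) to the
    set of its neighbours; s and t must be distinct vertices.
    """
    # Every vertex v other than s and t becomes an arc
    # (v, "in") -> (v, "out") of capacity 1, so at most one path can
    # pass through v. s and t are not split.
    def node_in(v):
        return (v, "out") if v in (s, t) else (v, "in")

    def node_out(v):
        return (v, "out")

    cap = {}  # residual capacities, cap[a][b]

    def add_arc(a, b):
        cap.setdefault(a, {})
        cap.setdefault(b, {})
        cap[a][b] = cap[a].get(b, 0) + 1
        cap[b].setdefault(a, 0)

    for v, neighbours in adj.items():
        if v not in (s, t):
            add_arc(node_in(v), node_out(v))
        for w in neighbours:
            add_arc(node_out(v), node_in(w))

    source, sink = node_out(s), node_out(t)
    cap.setdefault(source, {})
    cap.setdefault(sink, {})
    flow = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            a = queue.popleft()
            for b, c in cap[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if sink not in parent:
            return flow
        # Push one unit of flow along the path that was found.
        b = sink
        while parent[b] is not None:
            a = parent[b]
            cap[a][b] -= 1
            cap[b][a] += 1
            b = a
        flow += 1
```

An edge between $s$ and $t$ contributes one path of its own, which matches the "one plus" case in the definition of $\kappa(s, t)$.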



Fig. 1. A graph $G$ and the $\Delta$-replacement $G'$ of $G$ by the triangular cut $S = \{a, b, c\}$.
The dashed lines represent newly inserted edges.

One can also show:



Lemma 3. Let $S$ be a $k$-separator that separates two vertices $v$ and $w$ of an
undirected graph $G$ and let $p_1, p_2, \ldots, p_k$ be $k$ internally vertex-disjoint paths
from $v$ to $w$. Then, every path $p_i$ with $1 \le i \le k$ contains exactly one vertex of
$S$.



In the following, we use “biconnected” and “triconnected” as synonyms for
“2-vertex-connected” and “3-vertex-connected”, respectively. Moreover, queries
that ask whether two given vertices $v$ and $w$ are $k$-vertex-connected are
referred to as $k$-connectivity queries. Finally, in the following section, disjoint
will always mean vertex-disjoint and $k$-connected will mean $k$-vertex-connected.



4 An $O(n + m\alpha(m, n))$-Time Algorithm for the 2-VDPP

As we have already seen in Sect. 2, for solving the 2-VDPP in $O(n + m\alpha(m, n))$
time on undirected graphs, we only need to show that, given an instance
$I = (G, s_1, s_2, t_1, t_2)$ of the 2-VDPP, where $G$ is a triconnected graph, it is possible
to construct, in $O(m\alpha(m, n))$ time, an instance $I' = (G', s_1, s_2, t_1, t_2)$ of the
2-VDPP, where $G' = (V', E')$ is a triconnected graph such that there are four
disjoint paths from $s_1$, $s_2$, $t_1$, and $t_2$ to any other set $S \subseteq V' - \{s_1, s_2, t_1, t_2\}$ with
at most four vertices (except that the end points in $S$ may overlap), and $I$ is
2P-reducible to $I'$. For this we will make use of so-called triangle-replacements:
Suppose we are given an instance $I = (G, s_1, s_2, t_1, t_2)$ of the 2-VDPP such
that there exists a vertex $v$ in $G$ that is separated from $\{s_1, s_2, t_1, t_2\}$ by a
3-separator $S = \{a, b, c\}$. Then, like Gustedt in [7], we call $S$ a triangular cut of
$G$, and we call the graph $G'$ obtained from $G$ by deleting all vertices of the connected
component of $G - S$ containing $v$ together with their adjacent edges and by
inserting edges between all pairs of vertices of $S$ that are not already connected
by an edge the triangle-replacement of $G$ by $S$ (removing vertex $v$) or, for short,
the $\Delta$-replacement of $G$ by $S$ (see Fig. 1). Sometimes we also call the replacement
step that replaces $G$ by $G'$ a $\Delta$-replacement.

Shiloach [24] showed that if $G$ contains no triangular cut separating a vertex
$v$ from $\{s_1, s_2, t_1, t_2\}$, then there are four disjoint paths from $s_1$, $s_2$, $t_1$, and $t_2$
to any other set $S \subseteq V - \{s_1, s_2, t_1, t_2\}$ with at most four vertices. Thus, if for
an instance $I = (G, s_1, s_2, t_1, t_2)$ the graph $G$ does not contain a triangular cut,
we already know that we can solve the 2-VDPP in linear time. Otherwise, we
just eliminate all triangular cuts from $G$. More precisely, we repeatedly search
for a vertex $v$ of $G$ that is separated from $\{s_1, s_2, t_1, t_2\}$ by a triangular cut $S$
and replace $I$ by $I^* = (G^*, s_1, s_2, t_1, t_2)$, where $G^*$ is the $\Delta$-replacement of $G$
by $S$ removing $v$. We stop if, for the resulting instance $I' = (G', s_1, s_2, t_1, t_2)$
after all $\Delta$-replacements, $G'$ has no triangular cut $S$ separating a vertex $v$ from
$\{s_1, s_2, t_1, t_2\}$.

One can easily show that the $\Delta$-replacements do not change the solvability
of the problem (see [24] for example).

Lemma 4. Let $I = (G, s_1, s_2, t_1, t_2)$ be an instance of the 2-VDPP such that $G$
is a triconnected graph and let $G^*$ be a $\Delta$-replacement of $G$. Then $G^*$ is also a
triconnected graph and $I^* = (G^*, s_1, s_2, t_1, t_2)$ has a solution iff $I$ has a solution.

Moreover, Shiloach [24] showed that if $I$ is our original instance before the
first $\Delta$-replacement and $I'$ is the instance of the 2-VDPP after the last
$\Delta$-replacement, then the following holds:

Lemma 5. $I$ is 2P-reducible to $I'$. In particular, given two paths $p_1'$ and $p_2'$ that
solve $I'$, one can construct, in linear time, two paths $p_1$ and $p_2$ that solve $I$.

Concerning the running time of the reduction from $I$ to $I'$ one can show:

Lemma 6. Given an instance $I = (G, s_1, s_2, t_1, t_2)$ of the 2-VDPP, where
$G = (V, E)$ is a triconnected graph, a triangular cut $S$ of $G$, and a vertex $v \in V$ that is
separated from $\{s_1, s_2, t_1, t_2\}$ by $S$, the $\Delta$-replacement $G^* = (V^*, E^*)$ of $G$ by $S$
removing $v$ can be constructed in $O(|E - E^*|)$ time.

Proof. Just start a depth-first search on $G - S$ with $v$ as the source node and stop
after the depth-first search tree $T$ with root $v$ is complete. Then $G^*$ is the graph
obtained from $G$ by deleting all vertices of $T$ and all their adjacent edges in $G$
and by inserting a constant number of edges between vertices of $S$, if necessary.
Since the set of vertices of $T$ is a subset of $V - V^*$, it is easy to see that the
construction of $T$ and the deletion of the vertices and edges from $G$ can be done
in $O(|E - E^*|)$ time. □
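
The construction in the proof of Lemma 6 can be sketched in a few lines. This is an illustration under assumptions made here (an adjacency dictionary of neighbour sets and the hypothetical name `delta_replacement`), not the authors' implementation; it assumes that `S` really separates `v` from the rest of the graph and that `v` is not in `S`.

```python
def delta_replacement(adj, S, v):
    """Replace the component of G - S containing v by a triangle on S.

    adj maps every vertex to the set of its neighbours and is modified
    in place; S is the triangular cut, v the vertex to be removed.
    """
    S = set(S)
    # Depth-first search in G - S with v as the source node.
    component = {v}
    stack = [v]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in S and w not in component:
                component.add(w)
                stack.append(w)
    # Delete the visited vertices together with their incident edges.
    # Every neighbour of a visited vertex is either visited or in S.
    for u in component:
        for w in adj[u]:
            if w in S:
                adj[w].discard(u)
        del adj[u]
    # Insert the missing edges between the vertices of S.
    for a in S:
        for b in S:
            if a != b:
                adj[a].add(b)
    return adj
```

The search and the deletions only touch vertices and edges that disappear from the graph, which is the reason for the $O(|E - E^*|)$ bound stated in the lemma.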

From Lemma 6 we can conclude that, besides the time needed for finding
a vertex $v$ that can be separated from $\{s_1, s_2, t_1, t_2\}$ by a triangular cut and
the time needed for finding such a triangular cut, the construction of our final
instance $I'$ without any triangular cut takes linear time. Thus, we just need to
search for an efficient algorithm for determining a vertex $v$ and a triangular cut
$S$ such that $v$ is separated from $\{s_1, s_2, t_1, t_2\}$ by $S$. The main difference between
the algorithm of this paper and Shiloach’s algorithm is the computation of such
vertices and triangular cuts. While not searching explicitly for triangular cuts,
Shiloach’s algorithm is occasionally unable to proceed due to the presence of such
a cut. The cut is then removed by a $\Delta$-replacement. After the $\Delta$-replacement
Shiloach’s algorithm is restarted on the resulting graph. Since this may happen
$O(n)$ times, the running time of Shiloach’s algorithm is bounded by only $O(mn)$.

Here, in contrast, we systematically remove all triangular cuts separating
a vertex $v$ from $\{s_1, s_2, t_1, t_2\}$ efficiently. For identifying such a vertex $v$, the
following lemma will be very useful.

Lemma 7. Let $G = (V, E)$ be an undirected triconnected graph containing four
vertices $s_1$, $s_2$, $t_1$, and $t_2$, and let $G_x = (V_x, E_x)$ be the graph that is obtained
from $G$ by adding a new vertex $x$ to $V$ and new edges $\{x, s_1\}$, $\{x, s_2\}$, $\{x, t_1\}$,
and $\{x, t_2\}$ to $E$. Then, a vertex $v \in V - \{s_1, s_2, t_1, t_2\}$ can be separated from
$\{s_1, s_2, t_1, t_2\}$ by a set $S \subseteq V$ with $|S| \le 3$ in $G$ iff $S$ separates $v$ and $x$ in $G_x$.

Proof. Every path from $v$ to $x$ in $G_x$ must visit a vertex $w \in \{s_1, s_2, t_1, t_2\}$ before
reaching $x$. Hence, if $S$ is a 3-separator separating $v$ from $\{s_1, s_2, t_1, t_2\}$ in $G$,
every path from $v$ to $x$ must visit a vertex in $S$ before (or exactly when) visiting
a vertex in $\{s_1, s_2, t_1, t_2\}$. Thus, $S$ separates $v$ and $x$ in $G_x$.

Conversely, if $S$ is a 3-separator separating $x$ and $v$ in $G_x$, then the vertices of
$\{s_1, s_2, t_1, t_2\}$ are contained either in $S$ or in the connected component of $G_x - S$
containing $x$. It follows that the connected component of $G_x - S$ containing $v$
does not contain any vertex of $\{s_1, s_2, t_1, t_2\}$ and that the same is true of the
graph $G - S$. Hence, $v$ is separated from $\{s_1, s_2, t_1, t_2\}$ in $G$ by $S$. □
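
The equivalence of Lemma 7 is easy to check by exhaustive search on a small graph. The sketch below is only a sanity check under assumptions made here: the helper names `component`, `add_apex` and `lemma7_holds` are hypothetical, the new vertex is labelled `"x"` (which must not already occur in the graph), and the code does not verify that the input graph is triconnected.

```python
from itertools import combinations

def component(adj, start, removed):
    """Vertices reachable from start after deleting the set removed."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def add_apex(adj, terminals, x="x"):
    """Return a copy of the graph with a new vertex x joined to the terminals."""
    gx = {u: set(neighbours) for u, neighbours in adj.items()}
    gx[x] = set(terminals)
    for t in terminals:
        gx[t].add(x)
    return gx

def lemma7_holds(adj, terminals, v):
    """Compare both notions of separation for every S with |S| <= 3."""
    gx = add_apex(adj, terminals)
    others = [u for u in adj if u != v]
    for size in range(1, 4):
        for S in combinations(others, size):
            S = set(S)
            # S separates v from the terminals in G.
            sep_in_g = not (component(adj, v, S) & set(terminals))
            # S separates v and x in G_x.
            sep_in_gx = "x" not in component(gx, v, S)
            if sep_in_g != sep_in_gx:
                return False
    return True
```

For any graph and any choice of `v`, the function is expected to return `True`, since $x$ is reachable from $v$ in $G_x - S$ exactly if some terminal outside $S$ is reachable from $v$ in $G - S$.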

Now, for an instance $I = (G, s_1, s_2, t_1, t_2)$ of the 2-VDPP, like in Lemma 7,
let $G_x$ be the graph obtained from $G$ by adding a new vertex $x$ and edges
$\{x, s_1\}$, $\{x, s_2\}$, $\{x, t_1\}$, and $\{x, t_2\}$ to $G$. Then, if $G$ is triconnected, the same
is true of $G_x$, and $s_1, s_2, t_1, t_2$ are 4-connected to $x$. Lemma 7 implies that in
each reduction step, replacing an instance $I = (G, s_1, s_2, t_1, t_2)$ by an instance
$I^* = (G^*, s_1, s_2, t_1, t_2)$ such that $G^*$ is a $\Delta$-replacement of $G$, for identifying a
vertex $v \notin \{s_1, s_2, t_1, t_2\}$ that can be separated from $\{s_1, s_2, t_1, t_2\}$ by a triangular
cut, we only need to look for a vertex $v$ that is not 4-connected to $x$ in $G_x$ (note
that $v$ is not adjacent to $x$). To find such a vertex we could step through all
vertices of $V$ and test, for each vertex, whether it is 4-connected to $x$ (we will
later see efficient implementations for answering 4-connectivity queries). But, if
we recompute the set of all vertices that are not 4-connected to $x$ after each
update of $G$, we might have to answer up to $\Theta(n^2)$ connectivity queries for the
whole sequence of all reduction steps. The following lemma will help us to reduce
the number of connectivity queries.

Lemma 8. Let $I = (G, s_1, s_2, t_1, t_2)$ be an instance of the 2-VDPP such that
$G = (V, E)$ is a triconnected graph, let $S$ be a triangular cut of $G$, let $v$ be a
vertex of $G$ that is separated from $\{s_1, s_2, t_1, t_2\}$ by $S$, and, finally, let $G^*$ be the
$\Delta$-replacement of $G$ by $S$ removing $v$. Then, if two distinct vertices $a \in V^* - S$
and $b \in V^*$ are 4-connected in $G$, they are also 4-connected in $G^*$.

Proof. Let $a \in V^* - S$, $b \in V^*$ be two distinct vertices that are 4-connected in
$G$, and let $p_1, p_2, p_3$, and $p_4$ be pairwise internally vertex-disjoint simple paths
connecting $a$ and $b$.

If $b \in V^* - S$, only one path $p$ out of our four paths can visit a vertex in
$V - V^*$, and such a path must visit at least two vertices of $S$. Let us define $c$ to
be the first vertex on $p$ in $S$, and $d$ to be the last vertex on $p$ in $S$. Now, if we
replace the sub-path of $p$ from $c$ to $d$ by the edge $\{c, d\}$, $p$ and the other three
paths of $p_1, p_2, p_3$, and $p_4$ are four internally vertex-disjoint paths in $G^*$.

If $b \in S$, no more than two of the paths $p_1, p_2, p_3$, and $p_4$ can visit a vertex in
$V - V^*$. If they do so, they also visit a vertex in $S - \{b\}$. Then we follow these
paths up to a first node $c \in S - \{b\}$, and then use the edge $\{c, b\}$ to reach $b$.
After this replacement, the four paths are pairwise internally vertex-disjoint in
$G^*$. □

Hence, having replaced G by a new graph G*, if we search for a vertex v 
that is not 4-connected to x, we can exclude all vertices for which, in a previous 
reduction step, we have already tested whether they are 4-connected to x. If such 
a vertex was 4-connected to x, it remains 4-connected to x, and, otherwise, it was 
deleted from $G$. Thus, the number of 4-connectivity queries is reduced to $O(n)$.

For efficiently supporting 4-connectivity queries, we might consider using a
dynamic data structure. This is a data structure that supports two kinds of
operations: update operations, which in case of graph problems usually consist
of edge insertions and edge deletions, and queries, which in our case will
be 4-connectivity queries. The idea behind dynamic data structures for graph
problems is that, using the knowledge about a graph before an edge insertion or
edge deletion, queries can possibly be answered faster than without such
knowledge. Unfortunately, there is no known dynamic data structure supporting all
of the operations above in nearly constant time. However, Kanevsky, Tamassia,
Di Battista, and Chen [9] presented a very efficient incremental dynamic data
structure supporting only edge insertions and 4-connectivity queries. This data
structure can be initialized, in $O(m\alpha(m, n))$ time, with a triconnected graph
containing $m$ edges and $n$ vertices, and, after this initialization, it supports $O(m)$
queries and insertions in $O(m\alpha(m, n))$ time. With a simple trick we can make
use of this data structure: In addition to $G_x$, we also maintain a graph $H_x$ that,
before the first $\Delta$-replacement, is initialized with a copy of $G_x$. In a reduction
step replacing $G_x$ by a $\Delta$-replacement $G_x^*$ of $G_x$, we do not delete any vertex or
edge of $H_x$, but insert in $H_x$ the same edges that are inserted in $G_x$. Now, if we
want to test whether a vertex of $G_x$ is 4-connected to $x$ in $G_x$, we only need to
ask whether it is 4-connected to $x$ in $H_x$, as shown by Lemma 9. Queries in $H_x$
can now be answered with the data structure of Kanevsky et al.

Lemma 9. A vertex $w$ of $G_x$ is 4-connected to $x$ in $G_x$ iff it is 4-connected to
$x$ in $H_x$.

Proof. Let us define a set of $k$ paths $p_1, p_2, \ldots, p_k$ to be quasi internally disjoint
(or, for short, q.i. disjoint) if, for all pairs $(i, j)$ with $i, j \in \{1, \ldots, k\}$ and $i \ne j$,
no inner vertex of $p_i$ appears on $p_j$. We first show that, before and after each
reduction step (i.e. $\Delta$-replacement of $G_x$) transforming $G_x = (V_{G_x}, E_{G_x})$ and
$H_x = (V_{H_x}, E_{H_x})$ into new graphs, the following invariant holds: If, in $H_x$,
there are $k$ q.i. disjoint simple paths $p_i$ ($1 \le i \le k$) connecting $x$ to a vertex
$v_i \in V_{H_x} \cap V_{G_x}$, where $v_1, v_2, \ldots, v_k$ need not necessarily be distinct, there are
also $k$ q.i. disjoint paths $q_i$ ($1 \le i \le k$) connecting $x$ to $v_i$ in $G_x$ such that the set
of the vertices visited by $q_i$ is a subset of the vertices visited by $p_i$. The invariant
holds before the first reduction step, since $H_x$ is initialized with $G_x$. Let us now
assume that, for an $l \in \mathbb{N}$, after $l$ reduction steps, there are $k$ q.i. disjoint paths
$p_i$ ($1 \le i \le k$) in $H_x$ connecting $x$ to a vertex $v_i \in V_{H_x} \cap V_{G_x}$. Moreover, let $S$ be
the triangular cut used for the first $\Delta$-replacement of our algorithm, i.e. in the
first reduction step of our algorithm. If one path $p$ of the paths $p_1, \ldots, p_k$ uses
an edge $\{a, b\}$ that is deleted from $G_x$ after the first $\Delta$-replacement, this path
must also visit a vertex $c \in S$ before following edge $\{a, b\}$ and a vertex $d \in S$
after having reached edge $\{a, b\}$. We can now replace the sub-path of $p$ between
$c$ and $d$ by edge $\{c, d\}$. It is easy to see that, after this replacement, the paths
$p_1, \ldots, p_k$ remain pairwise q.i. disjoint. We may have to repeat this step to
obtain $k$ pairwise q.i. disjoint paths $p_1', \ldots, p_k'$ such that no path uses an edge
that is deleted from $G_x$ after the first $\Delta$-replacement, and such that the vertices
visited by $p_i'$ are a subset of the vertices visited by $p_i$. In the same way, we can
replace all edges of any path in $p_1', \ldots, p_k'$ that are deleted from $G_x$ after the
second, third, ... $\Delta$-replacement until the resulting paths use only edges that
are not deleted from $G_x$ after $l$ reduction steps.

The invariant above implies that, if, after an arbitrary number of reduction
steps, $x$ is 4-vertex-connected to a vertex $v \in V_{G_x} \cap V_{H_x}$ in $H_x$, the same is true
in $G_x$. The reverse direction is trivial. □

Given a vertex $v$ of $G_x$ that is not 4-connected to $x$, we still have to show how
we can determine a 3-separator $S$ that separates $v$ and $x$. Once again, instead
of searching for a 3-separator in $G_x$, we search for a 3-separator in $H_x$:

Lemma 10. Let $v$ be a vertex of $G_x$ and let $S$ be a 3-separator separating $v$ and
$x$ in $H_x$. Then $S$ is also a 3-separator separating $v$ and $x$ in $G_x$.

Proof. Assume that the lemma above is not true. Let us consider the first
replacement of $G_x = (V_{G_x}, E_{G_x})$ by a graph $G_x^* = (V_{G_x^*}, E_{G_x^*})$, and of $H_x = (V_{H_x}, E_{H_x})$
by a graph $H_x^* = (V_{H_x^*}, E_{H_x^*})$ such that the assertion of the lemma holds before,
but not after, the replacement. It is clear that every 3-separator $T \subseteq V_{G_x^*}$ in $H_x^*$
that separates a vertex $w \in V_{G_x^*}$ and $x$ also separates $w$ and $x$ in $G_x^*$, since $G_x^*$
is a subgraph of $H_x^*$. Thus, if the lemma does not hold for $G_x^*$ and $H_x^*$, there
must be a 3-separator $T$ with $T \not\subseteq V_{G_x^*}$ that separates a vertex $w \in V_{G_x^*}$ and $x$ in
$H_x^*$. Since $H_x^*$ is triconnected, there are three pairwise internally vertex-disjoint
paths from $x$ to $w$ in $H_x^*$. Then, as shown in the proof of Lemma 9, there also
exist three pairwise internally vertex-disjoint paths $q_1$, $q_2$, and $q_3$ from $x$ to $w$ in
$H_x^*$ that do not visit any vertex outside $G_x^*$. Hence, at least one vertex of $T$ is
not visited by $q_1$, $q_2$, and $q_3$. But this contradicts Lemma 3. □

For determining a 3-separator of $H_x$ separating a vertex $v$ and $x$, we can again
use the data structure of Kanevsky et al. This data structure also maintains,
in $O(\alpha(m, n))$ amortized time per edge insertion, a special decomposition tree
from which, for any given pair of two non-4-connected vertices $u$ and $w$, one can
construct, in constant time, two sets $S_1$ and $S_2$ such that one of these sets is a
3-separator separating $u$ and $w$ (see [9] for more details). Now, for two sets $S_1$
and $S_2$ with one of them being a 3-separator separating $v$ and $x$, we start two
interleaved depth-first searches on $G_x - S_1$ and on $G_x - S_2$ with $v$ as the source
node, and we continue until one of the two depth-first searches has completed
its depth-first search tree containing vertex $v$ without having visited vertex $x$.
Depending on whether this happens to the depth-first search on $G_x - S_1$ or to
that on $G_x - S_2$, either $S_1$ (in the first case) or $S_2$ (in the second case) is a
3-separator separating $v$ and $x$. Since the running time of the subroutine above is
dominated by that of the depth-first search that detects a 3-separator $S = S_1$ or
$S = S_2$, and, after identifying $S$, all vertices and edges visited by this depth-first
search will be deleted from $G_x$ in order to complete the reduction step replacing
$G_x$ by the $\Delta$-replacement $G_x^*$ of $G_x$ by $S$, the extra running time for determining
3-separators of two possible candidates can be bounded by $O(m)$, taken over all
reduction steps.
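
The interleaving of the two searches can be sketched with Python generators, each of which performs one unit of work (one scanned edge) per step. The names `search` and `find_separator` are hypothetical, and the candidate sets are simply passed in as arguments; obtaining them from the decomposition tree of [9] is outside the scope of this sketch.

```python
def search(adj, source, removed, forbidden):
    """Depth-first search in G - removed, advanced one scanned edge at a time.

    Yields None after each scanned edge. If the search completes without
    reaching the vertex forbidden, the set of visited vertices is yielded
    last; if forbidden is reached, the generator just ends.
    """
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            yield None  # one unit of work per scanned edge
            if w in removed or w in seen:
                continue
            if w == forbidden:
                return  # this candidate does not separate source and forbidden
            seen.add(w)
            stack.append(w)
    yield seen

def find_separator(adj, v, x, S1, S2):
    """Return (S, component of v in G - S) for a candidate S separating v and x."""
    candidates = {1: S1, 2: S2}
    searches = {key: search(adj, v, set(S), x) for key, S in candidates.items()}
    while searches:
        for key in list(searches):
            result = next(searches[key], "dead")
            if isinstance(result, str):  # search ended after reaching x
                del searches[key]
            elif result is not None:     # search completed without reaching x
                return candidates[key], result
    return None
```

Because both searches advance in lockstep, the total work is at most about twice the work of the search that succeeds, and the vertices it visited are exactly those deleted by the subsequent $\Delta$-replacement.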

Let us now analyze the complexity of the whole algorithm. For answering
all 4-connectivity queries and identifying 3-separators separating a vertex $v$ and
$x$, we need only $O(m\alpha(m, n))$ time using the data structure of Kanevsky et
al., since this data structure can be initialized in $O(m\alpha(m, n))$ time and our
algorithm consists of only $O(n)$ edge insertions and 4-connectivity queries (note
that $n < m$, since $G$ is triconnected). Since we have already shown that the
remaining parts of our algorithm for identifying triangular cuts run in linear
time, we can conclude that the following theorem holds:

Theorem 11. Let $I = (G, s_1, s_2, t_1, t_2)$ be an instance of the 2-VDPP, where
$G = (V, E)$ is an undirected triconnected graph. Then, in $O(m\alpha(m, n))$ time, an
instance $I' = (G', s_1, s_2, t_1, t_2)$ of the 2-VDPP can be constructed such that $I$
is 2P-reducible to $I'$ and such that $G' = (V', E')$ is an undirected triconnected
graph not containing any vertex $v \notin \{s_1, s_2, t_1, t_2\}$ that can be separated from
$\{s_1, s_2, t_1, t_2\}$ by a triangular cut.

With the results of Sect. 2 we can conclude: 

Theorem 12. The 2-VDPP can be solved in $O(n + m\alpha(m, n))$ time.

5 Extensions 

Perl and Shiloach [17] showed that the 2-EDPP on undirected graphs can be
reduced to the 2-VDPP on undirected graphs as follows: If, for an instance
$(G, s_1, s_2, t_1, t_2)$ of the 2-EDPP, there are two vertex-disjoint paths from $s_1$ to $t_1$
and from $s_2$ to $t_2$, just output two such paths with an algorithm for the 2-VDPP.
Otherwise, add a new vertex $x$ and edges $\{x, s_1\}$, $\{x, s_2\}$, $\{x, t_1\}$, and $\{x, t_2\}$ to $G$.
Then, there are two edge-(but not vertex-)disjoint paths from $s_1$ to $t_1$ and from
$s_2$ to $t_2$ if and only if there is a vertex $u$ of $G$ that is 4-edge-connected to $x$. Given
such a vertex $u$, one can use network-flow techniques to determine, in $O(m)$ time,
four edge-disjoint paths from $u$ to $s_1$, $s_2$, $t_1$, and $t_2$, and by concatenating two
of these paths it is easy to construct, in linear time, two edge-disjoint paths
from $s_1$ to $t_1$ and $s_2$ to $t_2$. In [17] the problem of determining a vertex $u$ that is
4-edge-connected to $x$ was solved by $n$ applications of network-flow techniques,
which yields a running time of $O(mn)$. But, as shown by Dinitz and Westbrook
[3], a sequence of $q$ 4-edge-connectivity queries in an undirected graph with $n$
vertices and $m$ edges can be answered in $O(q + m + n \log n)$ total time. Hence,
the 2-EDPP on undirected graphs can be solved in $O(m\alpha(m, n) + n \log n)$ time.

In [13] Lucchesi and Giglio presented a linear-time algorithm that, given an
instance $I = (G, s_1, s_2, t_1, t_2)$ of the decision version of the 2-VDPP on dags
with $G = (V, E)$, constructs an instance $I' = (G', s_1, s_2, t_1, t_2)$ of the decision
version of the 2-VDPP for undirected graphs with $G' = (V', E')$ such that
$|E'| \le |E|$, $|V'| \le |V|$, and there are two disjoint paths from $s_1$ to $t_1$ and from
$s_2$ to $t_2$ in $G$ iff the same is true of $G'$. Thus, there is an $O(m\alpha(m, n) + n)$-time
algorithm for solving the decision version of the 2-VDPP on dags. Finally,
similarly to the reduction of the 2-EDPP to the 2-VDPP on undirected graphs,
for the 2-EDPP on a dag, either two disjoint paths can be found with an
algorithm for the 2-VDPP, or we add two vertices $x$ and $y$ and directed edges
$(x, s_1)$, $(x, s_2)$, $(t_1, y)$, and $(t_2, y)$ to our input graph. In the latter case, there
are two edge-(but not vertex-)disjoint paths, $p_1$ from $s_1$ to $t_1$ and $p_2$ from $s_2$
to $t_2$, iff there is a vertex $u$ such that there are two edge-disjoint paths leading
from $x$ to $u$ as well as two edge-disjoint paths from $u$ to $y$. Such a vertex $u$, if it
exists, can be determined in $O(n + m \log_{2+m/n} n)$ time with a data structure of
Suurballe and Tarjan [25]. Hence, the decision version of the 2-EDPP on dags
is solvable in $O(n + m \log_{2+m/n} n)$ time.


Abstract. Meyer as well as Goldberg recently described algorithms that
solve the single-source shortest-paths problem in linear average time on 
graphs with random edge lengths drawn from the uniform distribution 
on [0,1]. This note points out that the same result can be obtained 
through simple combinations of standard data structures and with a 
trivial probabilistic analysis. 



1 Introduction 

The classic single-source shortest-paths (SSSP) problem asks, given a network
$\mathcal{N}$ with real-valued edge lengths and a distinguished vertex $s$ in $\mathcal{N}$, called the
source, for shortest paths in $\mathcal{N}$ from $s$ to all vertices in $\mathcal{N}$ for which such
shortest paths exist. This paper considers the important special case of the SSSP
problem in which all edge lengths are nonnegative, and which can be solved
with Dijkstra’s algorithm [2,3,10]. The running time of Dijkstra’s algorithm
depends on the implementation of a priority-queue data structure used by the
algorithm. Realizing the priority queue by means of a Fibonacci heap, Fredman
and Tarjan [4] achieved a time bound of $O(m + n \log n)$ for input networks with
$n$ vertices and $m$ edges. Faster algorithms, some of them randomized, are known
for the case of integer edge lengths (see [9]). None of these algorithms
guarantees a linear (expected) running time in all circumstances, however, for which
reason some researchers left the worst-case scenario and tried to obtain good
average performance on networks drawn at random. Meyer [7,8] showed that on
arbitrary directed graphs equipped with random edge lengths drawn
independently from the uniform distribution on the interval [0,1], the SSSP problem
can be solved in linear average time; moreover, a linear time bound holds with
high probability. Meyer’s algorithm is somewhat involved, and his probabilistic
analysis is complicated. Subsequently Goldberg [5] gave a simpler analysis of
an algorithm with the same properties and observed that independence of the
edge weights is not needed for a linear average time bound. This note points out
that essentially the same result can be obtained through simple combinations
of standard data structures and with a trivial probabilistic analysis. A generic
algorithm is introduced and proved correct, after which two concrete
instantiations of the generic algorithm, Algorithm A and Algorithm B, are presented and
analyzed. Algorithm A uses the Fibonacci-heap data structure and is, arguably,
simpler to describe than Algorithm B, which replaces the Fibonacci heap by a
standard binary heap. For the most part, the new algorithms can be viewed as
simplifications of both previous algorithms due to Meyer and Goldberg.

2 The Generic Algorithm 

Our goal is to give new proofs of the theorem below, which also follows from the 
work of Meyer [7,8] and Goldberg [5]. The model of computation is assumed to 
allow real numbers to be compared, added, multiplied, divided and rounded to 
their integer parts in constant time. 

Theorem 1. There is an algorithm A with the following properties: Given an
arbitrary directed graph $G = (V, E)$ with $n$ vertices and $m$ edges and a source
$s \in V$, if the edges in $G$ are equipped with random edge lengths drawn from the
uniform distribution on [0, 1], then A solves the instance of the SSSP problem
defined by the resulting network in $O(n + m)$ average time and $O(n + m)$ space.
If, moreover, the edge lengths are mutually independent and m > 2, the run- 
ning time of A is 0{n -\- m) with probability at least 1 — for every 

constant G > 0. 

Let us fix a directed graph $G = (V, E)$ with $n$ vertices and $m$ edges and a
source $s \in V$. In proving the theorem, we can assume without loss of generality
that all vertices in $V$ are reachable from $s$ in $G$ and that $m \ge 2$. Let $\mathcal{N}$ be
the instance of the SSSP problem obtained by giving each edge $e \in E$ a length
$c(e) \in [0, 1]$.

For all $v \in V$, denote by $\delta(v)$ the length of a shortest path in $\mathcal{N}$ from $s$ to $v$.
It is well-known that knowledge of $\delta(v)$ for all $v \in V$ allows us, in $O(n + m)$
time, to construct a shortest-path tree of $\mathcal{N}$ rooted at $s$, i.e., a tree that is the
union, over all $v \in V$, of a shortest path in $\mathcal{N}$ from $s$ to $v$ (see, e.g., [1, Section
4.3]), so our task is to compute $\delta(v)$ for all $v \in V$. When $f$ is a function and $x$
belongs to the domain of $f$, we call $f(x)$ the $f$ value of $x$.

Define $k = \lfloor \log_2 m \rfloor$ and $\Delta = 1/k$, let $V_1$ be the set of vertices in $V$ that have
one or more incoming edges of length smaller than $\Delta$, and take $V_2 = V \setminus V_1$.
Similarly to many other SSSP algorithms, the generic algorithm manipulates an
upper bound $d(v)$ on $\delta(v)$ for each $v \in V$. For brevity, let us define $\bar{d}(v) = d(v)$
for all $v \in V_1$ and $\bar{d}(v) = \lfloor d(v)/\Delta \rfloor \Delta$ for all $v \in V_2$. For all $v \in V$, we have
$d(v) - \Delta < \bar{d}(v) \le d(v)$. The generic algorithm is shown in Fig. 1.

When a vertex $u$ is chosen in line (5), we will say that $u$ is selected. The
operation $d(v) := \min\{d(v), d(u) + c(u, v)\}$ carried out in line (8) is called a
relaxation of the edge $(u, v)$. After the initialization in lines (1)-(3), the algorithm
selects the vertices in $V$ one by one, always choosing the next vertex as a
remaining vertex with minimal $\bar{d}$ value, and relaxes the edges leaving the selected
vertex one by one. At this level of abstraction, the only difference to Dijkstra’s
algorithm is that line (5) chooses $u$ to minimize $\bar{d}(u)$ rather than $d(u)$.

(1) $d(s) := 0$;
(2) for all $v \in V \setminus \{s\}$ do $d(v) := n - 1$;
(3) $U := V$;
(4) while $U \ne \emptyset$ do
(5)     choose $u \in U$ such that $\bar{d}(u)$ is minimal;
(6)     $U := U \setminus \{u\}$;
(7)     for all $v \in V$ with $(u, v) \in E$ do
(8)         $d(v) := \min\{d(v), d(u) + c(u, v)\}$;

Fig. 1. The generic SSSP algorithm.

Standard arguments show the correctness of the algorithm. Immediately after
the execution of line (2), $d$ is an upper bound on $\delta$, as every vertex in $V$ is
reachable from $s$ via a path with at most $n - 1$ edges and therefore of length
at most $n - 1$. Since $d$ is subsequently changed only through edge relaxations,
it is easy to see by induction that it remains an upper bound on $\delta$. Because
$d(w)$ never increases after the execution of line (2), it suffices to prove that
$d(w) = \delta(w)$ holds for every $w \in V$ when $w$ is selected.

Assume, by way of contradiction, that $w$ is the first vertex to be selected
when $d(w) > \delta(w)$. Let $(u, v)$ be an edge on a shortest path in $\mathcal{N}$ from $s$ to $w$
such that just before the selection of $w$, $v$ belongs to $U$, but $u$ does not. Since
$s \notin U$ and $w \in U$ at the time under consideration, such an edge exists. The
selection of $u$ takes place before that of $w$. Therefore, by assumption,
$d(u) = \delta(u)$ holds when $u$ is selected, and just after the relaxation of $(u, v)$, we have
$d(v) \le d(u) + c(u, v) = \delta(u) + c(u, v) = \delta(v)$. The relation $d(v) = \delta(v)$ still holds
when $w$ is selected; in particular, $v \ne w$.

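Since all edge lengths are nonnegative, we have $\delta(w) \ge \delta(v)$. If $w \in V_2$,
moreover, $\delta(w) \ge \delta(v) + \Delta$, since the last edge of the shortest path enters $w$ and
therefore has length at least $\Delta$. When $w$ is selected, we therefore have $d(w) >
\delta(w) \ge \delta(v) = d(v)$ and, if $w \in V_2$, $d(w) > \delta(w) \ge \delta(v) + \Delta = d(v) + \Delta$.
Because $\bar{d}(w) = d(w)$ for $w \in V_1$, $\bar{d}(w) > d(w) - \Delta$ for $w \in V_2$, and $\bar{d}(v) \le d(v)$,
whether or not $w \in V_2$, we find $\bar{d}(w) > \bar{d}(v)$. Since $v \in U$ when $w$ is selected,
this contradicts the selection of $w$ as a vertex in $U$ whose $\bar{d}$ value is minimal.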
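
The generic algorithm can be tried out directly. The following sketch is an illustration under assumptions made here, not one of the paper's efficient instantiations: line (5) is realized by a plain linear scan, the name `generic_sssp` and the edge-list input format are hypothetical, and edge lengths are taken to be `fractions.Fraction` values so that the rounding to multiples of $\Delta$ is exact.

```python
from fractions import Fraction

def generic_sssp(n, edges, s):
    """Generic SSSP algorithm on vertices 0..n-1.

    edges is a list of triples (u, v, c) with 0 <= c <= 1. As in the
    text, all vertices are assumed reachable from s and m >= 2.
    """
    m = len(edges)
    k = m.bit_length() - 1          # floor(log2(m)), at least 1 for m >= 2
    delta = Fraction(1, k)

    out = [[] for _ in range(n)]
    in_v1 = [False] * n             # v is in V_1 iff some incoming edge is < delta
    for u, v, c in edges:
        out[u].append((v, c))
        if c < delta:
            in_v1[v] = True

    d = [n - 1] * n                 # lines (1)-(2)
    d[s] = 0

    def d_bar(v):
        # d itself on V_1, d rounded down to a multiple of delta on V_2.
        return d[v] if in_v1[v] else (d[v] // delta) * delta

    U = set(range(n))               # line (3)
    while U:                        # line (4)
        u = min(U, key=d_bar)       # line (5), by linear scan
        U.remove(u)                 # line (6)
        for v, c in out[u]:         # lines (7)-(8): relax the edges leaving u
            d[v] = min(d[v], d[u] + c)
    return d
```

For example, with four vertices and the eight edges used in the test below, the call returns the distances 0, 1/5, 1/10 and 7/10 from vertex 0.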

3 Algorithm A 

Algorithm A stores the vertices in $U \cap V_1$ in a Fibonacci heap [4] with their $d$
values as keys. For $i = 0, \ldots, nk - 1$, the vertices $v$ in $U \cap V_2$ with $\bar{d}(v) = i\Delta$ are
stored in a set $L_i$ implemented as a doubly-linked list, known as a bucket, so that
insertions, deletions, and emptiness tests can all be carried out in constant time.
Moreover, for $j = 0, \ldots, n - 1$, the set $I_j = \{i \mid jk \le i < (j + 1)k$ and $L_i \ne \emptyset\}$ is
stored in a data structure $D_j$ that supports the following operations in constant
time: insertion, deletion, test of $I_j = \emptyset$ and, if $I_j \ne \emptyset$, the computation of
$\min I_j$. Note that, for $j = 0, \ldots, n - 1$, $I_j \ne \emptyset$ exactly if $U \cap V_2$ contains at least
one vertex $v$ with $j \le d(v) < j + 1$.

Because $k \le \log_2 m$, $D_j$ can be represented as a single integer. For
ease of discussion, consider the case $j = 0$. A set
$I \subseteq \{0, \ldots, k-1\}$ is represented through the integer
$\sum_{i \in I} 2^{k-1-i}$. To insert an element $i$ (which is not already
present), add $2^{k-1-i}$ to $D_j$. To delete an element $i$ (which is
present), subtract $2^{k-1-i}$ from $D_j$. $I_j$ is empty exactly if
$D_j = 0$. And to find $\min I_j$ when $I_j \ne \emptyset$, finally, compute
$k - 1 - \lfloor \log_2 D_j \rfloor$. If the available instruction set does
not allow constant-time computation of $2^{k-1-i}$ from $i$ or of
$\lfloor \log_2 D_j \rfloor$ from $D_j$, these mappings are easily realized via
lookup in tables of size $O(m)$ that can be constructed in $O(m)$ time and
shared among $D_0, \ldots, D_{n-1}$.
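The integer encoding of a set $I_j$ can be illustrated with a minimal Python sketch. The class name `SmallIntSet` and its method names are hypothetical, chosen only for this illustration; Python's `int.bit_length` stands in for the $\lfloor \log_2 \cdot \rfloor$ instruction or lookup table mentioned above.

```python
class SmallIntSet:
    """Subset of {0, ..., k-1} kept in one integer; element i uses bit k-1-i.

    Hypothetical helper for illustration only.
    """

    def __init__(self, k):
        self.k = k
        self.bits = 0

    def insert(self, i):
        # Precondition (as in the text): i is not already present.
        self.bits += 1 << (self.k - 1 - i)

    def delete(self, i):
        # Precondition (as in the text): i is present.
        self.bits -= 1 << (self.k - 1 - i)

    def is_empty(self):
        return self.bits == 0

    def minimum(self):
        # For bits > 0, floor(log2(bits)) equals bits.bit_length() - 1.
        return self.k - 1 - (self.bits.bit_length() - 1)
```

With $k = 8$, inserting 3, 5, 6 and 7 yields the bit pattern 00010111, and the smallest element corresponds to the leftmost 1 bit.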

A final variable is an integer $j^*$ that, logically, points to one of the
sets $I_j$ and that is initialized to 0 (i.e., pointing to $I_0$). The complete
representation of the vertices in $U \cap V_2$ is illustrated in Fig. 2.



[Figure: bit vectors such as 00000000, 00010111, 10100000, 00000000.]

Fig. 2. The representation of $U \cap V_2$.



We now discuss the implementation of the generic algorithm using the data 
structures introduced above. Line (5) of the generic algorithm is refined as shown 
in Fig. 3. 



(5.1) $x := \min(\{d(v) \mid v \in U \cap V_1\} \cup \{\infty\})$;

(5.2) while $I_{j^*} = \emptyset$ and $j^* + 1 \le x$ do $j^* := j^* + 1$;

(5.3) $i^* := \min(I_{j^*} \cup \{\infty\})$;

(5.4) if $i^*\Delta < x$

(5.5) then choose $u \in L_{i^*}$ arbitrarily

(5.6) else choose $u \in U \cap V_1$ with $d(u) = x$;



Fig. 3. The refinement of line (5) in Algorithm A. 



Line (5.1) computes $x$ as the smallest $d$ value of a vertex in $U \cap V_1$,
or as the dummy value $\infty$ if $U \cap V_1 = \emptyset$. If $j^* < x$, line
(5.2) repeatedly increments $j^*$ until some vertex in $U$ has a $d$ value in
$[j^*, j^* + 1)$. We claim that subsequently
$\min\{d(v) \mid v \in U\} \in [j^*, j^* + 1)$. This would be trivial if $j^*$
were set to 0 prior to the execution of line (5.2). In reality, however, $j^*$
keeps the value that it acquired in the last execution of line (5.2), if any,
and the "search" takes place only from there. That this is also correct
follows from the easily verifiable fact that the $d$ values of the selected
vertices, in the order in which the vertices are selected, form a
nondecreasing sequence: the relaxation of an edge $(u,v)$ cannot give $v$ a
smaller $d$ value than that of $u$, not even if $u \in V_1$ and $v \in V_2$.
Line (5.3) computes the integer $i^*$ such that the smallest $d$ value of a
vertex in $U \cap V_2$ is $i^*\Delta$, or $\infty$ if $I_{j^*} = \emptyset$
(which may happen even if $U \cap V_2 \ne \emptyset$). Finally the vertex $u$
is chosen in line (5.5) or (5.6) based on a direct comparison between
$i^*\Delta$ and $x$.

At this point, the implementation of all steps of the algorithm is
straightforward: The initialization in line (3) calls the Fibonacci-heap
operation insert $|V_1|$ times and inserts $|V_2|$ vertices in appropriate
buckets. Lines (5.1) and (5.6) are realized through the findmin operation of
the Fibonacci heap. Line (6) involves a call of the Fibonacci-heap operation
deletemin if $u \in V_1$, and the removal of $u$ from some bucket otherwise.
The relaxation of $(u,v)$ in line (8) calls the Fibonacci-heap operation
decreasekey if $v \in V_1$, and possibly moves $v$ from some bucket to another
one otherwise. Finally, since the set $U$ is not explicitly maintained, it is
convenient to replace the test in line (4) by a construction that simply
causes lines (5)-(8) to be executed exactly $n$ times.

The algorithm works in $O(n + m)$ space, except that, as described so far, it
needs to store headers of the lists $L_i$ in an array $A$ of size $nk$. The
following standard observation reduces the overall space requirements to
$O(n + m)$: Because every edge "spans" at most $k$ "bucket widths" (i.e., its
length is at most $k\Delta$), all nonempty buckets, except for the single
bucket that contains vertices $v$ with $d(v) = n - 1$, at all times belong to
a (varying) set of $k + 1$ consecutive buckets. Therefore $A$ can be replaced
by a "sliding window" of size $k + 1$, i.e., an array of size
$k + 1 = O(\log n)$ thought of as cyclic, but cut at a moving position, that
always represents the segment of $A$ of current interest.
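The sliding window can be sketched as a cyclic array addressed modulo its size. This is a minimal illustration under stated assumptions: the class `SlidingBuckets` and its methods are hypothetical names, Python sets replace the doubly-linked lists, and the separate bucket for vertices with $d(v) = n - 1$ is left out. The sketch is only correct while all live bucket indices lie within `width` consecutive values, which is the property the observation above provides.

```python
class SlidingBuckets:
    """Cyclic array of `width` buckets standing in for a long bucket array.

    Hypothetical helper: bucket index i is stored in slot i mod width.
    """

    def __init__(self, width):
        self.width = width
        self.slots = [set() for _ in range(width)]

    def add(self, index, vertex):
        self.slots[index % self.width].add(vertex)

    def remove(self, index, vertex):
        self.slots[index % self.width].remove(vertex)

    def bucket(self, index):
        return self.slots[index % self.width]
```

Once bucket 2 has been emptied, a window of width 4 can reuse its slot for bucket 6 without any copying.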

Outside of calls of Fibonacci-heap operations and of line (5), the time spent
by the algorithm is easily seen to be $O(n + m)$. Since the Fibonacci-heap
operation findmin works in constant time, a single execution of line (5) takes
$O(1 + h)$ time, where $h$ is the resulting increase in $j^*$. $j^*$ remains
bounded by $n - 1$ and line (5) is executed $n$ times, so the total time spent
in line (5) is $O(n)$. The algorithm executes exactly $|V_1|$ calls of each of
the Fibonacci-heap operations insert and deletemin and at most $m$ calls of
decreasekey. Since the Fibonacci heap has a constant amortized time bound for
insert and decreasekey and a logarithmic amortized bound for deletemin, the
total time spent in Fibonacci-heap operations and altogether is
$O(n + m + |V_1| \log n)$.

The quantity $|V_1|$ is clearly upper-bounded by the number $S$ of edges in
$E$ of length smaller than $\Delta$. If the edge lengths are drawn from the
uniform distribution on $[0, 1]$, the expected value $E(S)$ of $S$ is
$m\Delta$, so that the average running time of the algorithm is
$O(n + m + m\Delta \log n) = O(n + m)$. If, moreover, the edge lengths are
mutually independent, $S$ is binomially distributed. By the Chernoff bound
[6, Eq. (8)], $\Pr(S \ge r) \le 2^{-r}$ then holds for every $r \ge 6E(S)$.
Using this with $r = Cm/\log_2 m$ for arbitrary constant $C \ge 12$, we can
conclude that the running time is $O(n + m)$ with probability at least
$1 - 2^{-Cm/\log_2 m}$. This ends the proof of Theorem 1.

4 Algorithm B 

Because the Fibonacci heap is occasionally viewed with suspicion, in terms of its 
practical performance, this section presents an alternative instantiation of the 
generic algorithm that allows the Fibonacci heap to be replaced by a standard 
binary heap. 

In Algorithm B, the vertices in $V_1$, rather than being stored separately in
a Fibonacci heap, are stored together with the vertices in $V_2$ in the lists
$L_i$, every vertex $v \in V$ belonging to $L_{\lfloor d(v)/\Delta \rfloor}$.
The only exception is that the current bucket, $L_{i^*}$, is instead
represented via a binary heap $H$ and a list $L$, both initialized to be
empty. Line (5) of the generic algorithm is refined as shown in Fig. 4 below.
In the interest of succinctness, the symbols "H" and "L" are used to denote
also the sets of vertices stored in the corresponding data structures.



(5.1) if $H \cup L = \emptyset$ then

(5.2)     while $I_{j^*} = \emptyset$ do $j^* := j^* + 1$;

(5.3)     $i^* := \min I_{j^*}$;

(5.4)     Represent $L_{i^*} \cap V_1$ in $H$ and $L_{i^*} \cap V_2$ in $L$;

(5.5) if $L \ne \emptyset$

(5.6) then choose $u \in L$ arbitrarily

(5.7) else choose $u \in H$ such that $d(u)$ is minimal;



Fig. 4. The refinement of line (5) in Algorithm B. 



Line (5.1) tests whether the current bucket has been exhausted. If so, lines
(5.2) and (5.3) establish a new current bucket in a way familiar from
Algorithm A. We already noted earlier that, whenever the current bucket is
nonempty, it contains a vertex in $U$ with minimal $d$ value.

In line (5.4), the new current bucket is converted from the list
representation to the representation in $H$ and $L$: Those vertices in
$L_{i^*}$ that belong to $V_1$ are stored in $H$ (with their $d$ values as
keys), and those belonging to $V_2$ are stored in $L$ ($H$ and $L$ are empty
before this step). Future operations on (vertices in) the current bucket are
executed on $H$ and $L$. Lines (5.5)-(5.7), finally, choose $u$ from $L$, if
possible, and otherwise from $H$. Since $d(u) \le d(v)$ for all $u \in V_2$
and $v \in V_1$ with
$\lfloor d(u)/\Delta \rfloor = \lfloor d(v)/\Delta \rfloor$, this implements
line (5) of the generic algorithm correctly.

The relaxation of an edge $(u,v)$ may cause $v$ to enter the current bucket,
or may decrease $d(v)$ while $v$ belongs to the current bucket. The important
thing to note, however, is that this can happen only if $c(u,v) < \Delta$, and
therefore not at all if $v \in V_2$. This shows that over the whole execution,
$H$ must support at most $S$ decreasekey operations, where $S$ is the number
of edges in $E$ of length smaller than $\Delta$, in addition to exactly
$|V_1|$ insertions, $|V_1|$ deletemin operations, and $n$ emptiness tests.
This needs $O((S + |V_1|) \log n) = O(S \log n)$ time. Moreover, $L$ must
support exactly $|V_2|$ insertions, $|V_2|$ deletions, and $2n$ emptiness
tests, which takes $O(n)$ time. Referring to the old analysis of the
probability distribution of $S$, we obtain a new proof of Theorem 1.

5 Concluding Remarks 

The bound of $2^{-Cm/\log_2 m}$ on the probability of exceeding the time bound
of $O(n + m)$ established in Theorem 1, for arbitrary fixed $C$, is not quite
as good as
the bound of 2“'""* that holds for the approaches of Meyer [7,8] and Goldberg [5], 
but certainly amply suffices for all practical purposes. 

As described, the algorithms cannot cope with arbitrary nonnegative edge
lengths. This is easily remedied: Letting $M$ be an upper bound on the maximum
finite distance of a vertex in $V$ from $s$ (such as $n - 1$ times the maximum
edge length), initialize $d(v)$ to $M$ instead of to $n - 1$ for all
$v \in V \setminus \{s\}$, redefine $\Delta$ as $M/((n-1)k)$ and use $nk$
buckets as before (or take $\Delta = M/(mk)$ and use $(m+1)k$ buckets).
Slightly more elaborate algorithms work in linear time if there is a (possibly
unknown) constant $\epsilon > 0$ so that the number of edges of length smaller
than $\epsilon M/(m \log_2 m)$ is bounded by $m/(\epsilon \log_2 m)$.

The algorithms adapt easily to the case of edge lengths drawn from the
uniform distribution on some set $\{a, a+1, \ldots, b\}$ of consecutive
integers with $0 < a < b$. The only nonobvious fact to note is that if
$\Delta < 1$, then the Fibonacci heap (in Algorithm A) and the binary heap (in
Algorithm B) should be ignored, i.e., all vertices in $V$ should be considered
to belong to $V_2$.

Algorithm A deteriorates gracefully for unfavorable edge lengths towards a
worst-case running time of $O(m + n \log n)$ without any need for it to be run
in parallel with a separate algorithm with good worst-case behavior. If a
smaller probability of exceeding the time bound or, for integer edge lengths,
a better asymptotic worst-case running time is desired, these can be achieved
by replacing the data structures $D_j$ and/or the Fibonacci heap by more
efficient but more complicated data structures.

Abstract. We give a method for approximating any $n$-dimensional
lattice with a lattice $\Lambda$ whose factor group $\mathbb{Z}^n/\Lambda$ has
$n - 1$ cycles of equal length with arbitrary precision. We also show that a
direct consequence of this is that the Shortest Vector Problem and the Closest
Vector Problem cannot be easier for this type of lattices than for general
lattices.

Keywords: lattices, shortest vector problem, closest vector problem 



1 Introduction 

The interest in the computational complexity of lattice problems started in
the beginning of the 1980s, when van Emde Boas published the first
NP-completeness result for lattice problems [16]. Several hardness results for
different variants of these problems and for different subsets of lattices
have followed. One such way of classifying lattices is according to the cycle
structure of the Abelian group $\mathbb{Z}^n/\Lambda$, which is the main focus
of this paper. Previous results on the complexity of lattice problems that
either explicitly or implicitly consider lattices with a certain cycle
structure include [1,3,13,14].

There are two reasons to study the hardness of certain lattice problems in
different subclasses of lattices rather than for general lattices. The first
reason is purely theoretical: it gives us a better understanding of how the
computational complexity of lattice problems behaves if we restrict ourselves
to certain lattice classes. The second reason is more practical: most hardness
results are worst-case results for general lattices. The lattices that appear
in many applications may have certain structural properties. It would be
desirable to have results that show that these properties cannot be used to
solve lattice problems more efficiently.

The first result on the cycle structure was published by Paz and Schnorr [13]. 
In that paper it is shown that any lattice can be approximated arbitrarily well 
by a lattice with one cycle. In other words, the lattices with one cycle form a 
hard core. On the other hand, the lattices Cai and Nerurkar [3] prove to be hard 
in the improved version of Ajtai [1] have up to n/c cycles. Although the results 
are different in nature (the latter is not an NP-hardness result), it is interesting 
to note that they give hardness results for lattices with different cycle structure. 
This gives rise to the question of the role of the cycle structure in the complexity 
of lattice problems. 

The influence of the cycle structure on the hardness of lattice problems has
practical implications. For some cryptosystems (e.g., NTRU [6]) there are
attacks based on finding short vectors in certain lattices. The lattices used
in some of these attacks have a cycle structure that differs from the cycle
structure of the lattices that previously have been shown to be NP-hard.

Since a lattice with $n$ cycles always can be transformed into a lattice with
fewer cycles by a simple rescaling, the maximum number of cycles that is
meaningful to analyze is $n - 1$. Trolin showed that the exact version of SVP
under the max-norm is NP-complete for $n$-dimensional lattices with $n - 1$
cycles of equal length [14].

In this paper we investigate the importance of the cycle structure further. Our 
main result is a polynomial-time transformation that with arbitrary precision 
approximates any n-dimensional lattice with a lattice that has n — 1 cycles of 
equal length, showing that these lattices form a hard core. A consequence of this 
is that short vectors and close vectors cannot be computed more efficiently in 
this class of lattices than in general lattices, except possibly for a polynomial 
factor. As our transformation only changes the size of the coordinates of the 
basis vectors and not the dimension of the lattice, the transformation is rather 
tight. 

2 Background 

2.1 Lattices 

A lattice is a discrete additive subgroup $\Lambda \subseteq \mathbb{R}^n$. A
lattice $\Lambda$ can be defined by its basis, a set of independent vectors
$\{b_1, b_2, \ldots, b_m\}$, $b_i \in \mathbb{R}^n$, such that
$u \in \Lambda$ if and only if there exist integers $t_1, t_2, \ldots, t_m$
such that $u = \sum_{i=1}^{m} t_i b_i$. If $m = n$ the lattice is said to be
full-dimensional. Only lattices that are subsets of $\mathbb{Q}^n$ (and often
$\mathbb{Z}^n$) are considered in this paper. For each vector
$v \in \mathbb{R}^n$ and $p \ge 1$ the $\ell_p$-norm is defined as
$\|v\|_p = \left(\sum_{i=1}^{n} |v_i|^p\right)^{1/p}$. The
$\ell_\infty$-norm, also called the maximum norm, is defined as
$\|v\|_\infty = \max_{i=1}^{n} |v_i|$. When no index is given,
$\|v\| = \|v\|_2$.

A basis matrix of a lattice is a matrix whose rows form a basis of the
lattice. The determinant of a lattice is the absolute value of the determinant
of a basis matrix. For lattices that are not full-dimensional, the determinant
is defined as $\det(\Lambda) = \sqrt{\det(BB^T)}$. It is not difficult to see
that the determinant is independent of the choice of basis.
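The basis independence of the determinant can be checked numerically on small examples. The following Python sketch uses exact rational arithmetic; the function names `det` and `gram_det` are hypothetical and introduced only for this illustration. It returns $\det(BB^T)$, whose square root is the lattice determinant.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result


def gram_det(basis):
    """det(B B^T) for a basis given as rows; det(lattice) is its square root."""
    gram = [[sum(x * y for x, y in zip(u, v)) for v in basis] for u in basis]
    return det(gram)
```

For instance, the rows (2, 1), (0, 3) and the rows (2, 1), (2, 4) generate the same lattice, and both bases give $\det(BB^T) = 36$, i.e. determinant 6.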

2.2 Basis Representations 

In different situations different bases may be suitable. Two such representations 
are the Hermite Normal Form and LLL-reduced bases. 

A basis $\{b_1, b_2, \ldots, b_n\}$ is said to be in Hermite Normal Form (HNF)
if the basis matrix is upper triangular, and $b_{ii} > b_{ji} \ge 0$ for
$j < i$. The Hermite Normal Form can be computed efficiently [7]. In [11]
Micciancio gives some results on the use of HNF in cryptographic applications.
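The HNF condition is easy to test directly. The sketch below is a hypothetical helper (`is_hnf` is not a name from the paper) for square integer matrices given as lists of rows; it additionally assumes that every diagonal entry must be positive, which the condition $b_{ii} > b_{ji} \ge 0$ only states explicitly for columns that have entries above the diagonal.

```python
def is_hnf(b):
    """Check the HNF condition for a square integer matrix (rows = basis).

    Assumption for this sketch: all diagonal entries must be positive.
    """
    n = len(b)
    for i in range(n):
        if b[i][i] <= 0:
            return False
        for j in range(n):
            if j > i and b[j][i] != 0:
                # Entry below the diagonal: the matrix is not upper triangular.
                return False
            if j < i and not (0 <= b[j][i] < b[i][i]):
                # Entry above the diagonal must be reduced modulo b[i][i].
                return False
    return True
```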

An LLL-reduced basis is defined as follows. Every lattice basis
$\{b_1, b_2, \ldots, b_m\}$ has an associated orthogonal basis
$\{b_1^*, b_2^*, \ldots, b_m^*\}$ defined by

$$b_i^* = b_i - \sum_{j=1}^{i-1} \mu_{ij} b_j^*$$

where $\mu_{ij} = \langle b_i, b_j^* \rangle / \|b_j^*\|^2$ for $i > j$.
Extending the definition, we let $\mu_{ii} = 1$ and $\mu_{ij} = 0$ for
$i < j$. It holds that $\prod_{i=1}^{m} \|b_i^*\| = \det(\Lambda)$. A lattice
basis is called LLL-reduced (after Lenstra, Lenstra and Lovász) with $\delta$,
$1/4 < \delta < 1$, if $|\mu_{ij}| \le 1/2$ for $1 \le j < i \le m$ and
$\delta \|b_{i-1}^*\|^2 \le \|b_i^*\|^2 + \mu_{i,i-1}^2 \|b_{i-1}^*\|^2$ for
$i = 2, \ldots, m$. An LLL-reduced basis can be found in polynomial time [10].
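The two LLL conditions can be verified for a given basis with a short Python sketch. It computes the Gram-Schmidt vectors and coefficients exactly and then tests size reduction and the exchange condition. The function names are hypothetical, and the sketch only checks reducedness; it does not perform the reduction itself.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return (b_star, mu): orthogonalized vectors and coefficients mu[i][j], j < i."""
    b_star, mu = [], []
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        row = []
        for j in range(i):
            m = dot(b, b_star[j]) / dot(b_star[j], b_star[j])
            row.append(m)
            v = [x - m * y for x, y in zip(v, b_star[j])]
        b_star.append(v)
        mu.append(row)
    return b_star, mu


def is_lll_reduced(basis, delta=Fraction(3, 4)):
    """Test size reduction (|mu| <= 1/2) and the exchange condition."""
    b_star, mu = gram_schmidt(basis)
    for i in range(len(basis)):
        if any(abs(m) > Fraction(1, 2) for m in mu[i]):
            return False
        if i > 0:
            prev = dot(b_star[i - 1], b_star[i - 1])
            cur = dot(b_star[i], b_star[i])
            if delta * prev > cur + mu[i][i - 1] ** 2 * prev:
                return False
    return True
```

As a sanity check of the identity relating the orthogonalized lengths to the determinant, the basis (2, 1), (0, 3) has squared Gram-Schmidt lengths 5 and 36/5, whose product is $36 = 6^2$.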

The two most studied lattice problems are the closest vector problem, CVP,
and the shortest vector problem, SVP. The input to the closest vector problem
is a lattice $\Lambda$, $y \in \mathbb{R}^n$ and $d > 0$. The problem is to
determine whether or not there exists $x \in \Lambda$ such that
$\|y - x\| \le d$. SVP is the homogeneous variant of the same problem, where
we want to determine whether or not there exists $x \in \Lambda$ such that
$0 < \|x\| \le d$. As a matter of fact, these are both families of problems,
since every norm gives a different problem.

It is known that CVP is NP-complete for any $\ell_p$-norm (including the
max-norm, $\ell_\infty$) [16]. It is also known that SVP is NP-complete in the
$\ell_\infty$-norm [9] and under randomized reductions also for any
$\ell_p$-norm [2]. It has been shown that SVP is NP-hard to approximate within
any factor smaller than $\sqrt{2}$ under
randomized reductions [12]. Khot has improved that inapproximability bound in
$\ell_p$-norm to $p^{1-\epsilon}$ for large values of $p$ under randomized
reductions [8] and Dinur has improved the bound for $\ell_\infty$-norm to
$n^{O(1/\log\log n)}$ [4].

2.3 The Cycle Structure 

In this paper we focus on the role of the cycle structure of a lattice in the
complexity of lattice problems. The cycle structure is defined as the
algebraic structure of the group $\mathbb{Z}^n/\Lambda$ for a
full-dimensional lattice $\Lambda$.

Definition 1 (Cycle structure). A lattice $\Lambda$ is said to have the cycle
structure $k_1 \times k_2 \times \cdots \times k_m$, if the additive factor
group
$\mathbb{Z}^n/\Lambda \cong \mathbb{Z}_{k_1} \times \mathbb{Z}_{k_2} \times \cdots \times \mathbb{Z}_{k_m}$
and $k_i$ divides $k_{i+1}$ for $i = 1, 2, \ldots, m - 1$.

Cycles of length one are called trivial. In the cases where it is not clear from 
the context we specify whether non-trivial cycles should be considered. A lattice 
with only one non-trivial cycle is called cyclic. Depending on context, it may be 
more convenient to number the cycle lengths in increasing or decreasing order. 

Another way to describe the number of cycles of a lattice is to use a different 
representation of the lattice, namely as a set of modular equations. Every lattice 
can be described in this way. 

Theorem 1. Let $\Lambda \subseteq \mathbb{Z}^n$ be a lattice. Then there exist
$n$-dimensional vectors $a_1, a_2, \ldots, a_m$ and integers
$b_1, b_2, \ldots, b_m$, $b_i > 1$, such that

$$\Lambda = \{x : \langle a_1, x \rangle \equiv 0 \bmod b_1 \wedge \langle a_2, x \rangle \equiv 0 \bmod b_2 \wedge \ldots \wedge \langle a_m, x \rangle \equiv 0 \bmod b_m\}.$$

The essence of this theorem is that any lattice can be expressed as a system 
of modular linear equations whose solutions form the lattice. 

The connection to the cycle structure is that the number of nontrivial cycles
is $m$, and the length of cycle $i$ is $b_i$, provided that the system of
equations has been reduced to minimize the number of equations and that the
gcd of the coefficients and the modulus is 1 in each equation.
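A membership test based on such modular equations can be sketched in a few lines of Python. The function name `in_lattice` and the example are assumptions made for this illustration: the lattice generated by the rows (1, 0, 2), (0, 1, 3), (0, 0, 5) is described by the single equation $\langle (2, 3, -1), x \rangle \equiv 0 \bmod 5$, since a vector $x$ lies in it exactly when $x_3 \equiv 2x_1 + 3x_2 \pmod 5$.

```python
def in_lattice(x, equations):
    """Return True if x satisfies every equation <a, x> = 0 (mod b).

    `equations` is a list of (a, b) pairs, with a a coefficient vector
    and b the modulus.
    """
    return all(
        sum(a_i * x_i for a_i, x_i in zip(a, x)) % b == 0
        for a, b in equations
    )
```

With one equation of modulus 5, the example lattice has a single nontrivial cycle of length 5.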

In the transformations we approximate lattices in $\mathbb{Z}^n$ with lattices
in $\mathbb{Q}^n$. The standard definition of cycle structure cannot be
applied to general lattices in $\mathbb{Q}^n$. Since multiplication by a
constant does not affect lattice problems such as SVP and CVP, we will define
the cycle structure of a lattice $\Lambda \subseteq \mathbb{Q}^n$ as the cycle
structure of $k\Lambda$, where $k$ is the smallest integer such that
$k\Lambda \subseteq \mathbb{Z}^n$.

We now state three simple lemmas (proofs omitted) on the cycle structure. 

Lemma 1. Let $\Lambda \subseteq \mathbb{Z}^n$ be a lattice with cycle
structure $k_1 \times k_2 \times \cdots \times k_m$. Then
$\det(\Lambda) = \prod_{i=1}^{m} k_i$.

Lemma 2. Let $\Lambda \subseteq \mathbb{Z}^n$ be a lattice with cycle
structure $k_1 \times k_2 \times \cdots \times k_n$ (not necessarily all
nontrivial). Then the lattice $t \cdot \Lambda$ has cycle structure
$t \cdot k_1 \times t \cdot k_2 \times \cdots \times t \cdot k_n$.

Lemma 3. Let $\Lambda \subseteq \mathbb{Z}^n$ be a lattice with cycle
structure $k_1 \times k_2 \times \cdots \times k_n$,
$k_1 \ge k_2 \ge \cdots \ge k_n$. Then the lattice
$\frac{1}{k_n} \cdot \Lambda$ has cycle structure
$\frac{k_1}{k_n} \times \frac{k_2}{k_n} \times \cdots \times \frac{k_n}{k_n}$.

Because of the divisibility requirement, the lattice in Lemma 3 is in
$\mathbb{Z}^n$. Should $k_n$ be greater than one, we can always remove it as
shown in the lemma. Hence we can assume without loss of generality that the
number of cycles is less than $n$.
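For small full-dimensional integer lattices the cycle structure can be computed directly, which makes it possible to check Lemmas 1 and 2 on examples. The sketch below is not the method of the paper: it uses the standard fact that the invariant factors of an integer matrix are quotients of consecutive determinantal divisors (gcds of all $i \times i$ minors). The function names are hypothetical, and the running time is exponential in $n$, so it is only meant for tiny matrices.

```python
from itertools import combinations
from math import gcd


def int_det(m):
    """Determinant of a small integer matrix by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * int_det(minor)
    return total


def cycle_structure(basis):
    """Cycle lengths k_1 | k_2 | ... | k_n of Z^n / L (trivial cycles included).

    `basis` is a full-rank n x n integer matrix given as a list of lists.
    """
    n = len(basis)
    divisors = [1]  # d_0 = 1 by convention
    for size in range(1, n + 1):
        g = 0
        for rows in combinations(range(n), size):
            for cols in combinations(range(n), size):
                sub = [[basis[r][c] for c in cols] for r in rows]
                g = gcd(g, int_det(sub))
        divisors.append(g)
    return [divisors[i] // divisors[i - 1] for i in range(1, n + 1)]
```

For the diagonal basis diag(2, 4) the result is [2, 4], whose product is the determinant 8 (Lemma 1), and scaling the lattice by 3 gives [6, 12] (Lemma 2).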

2.4 Previous Results on the Cycle Structure 

In [13] the following theorem is proved. 

Theorem 2. Let $\Lambda \subseteq \mathbb{Z}^n$ be a lattice. Then for every
$\epsilon > 0$ we can efficiently construct a linear transformation
$\sigma_{\Lambda,\epsilon} : \Lambda \to \mathbb{Z}^n$ such that
$\sigma_{\Lambda,\epsilon}(\Lambda)$ is a lattice and for some integer $k$

1. $\forall u \in \Lambda : \|u - \sigma_{\Lambda,\epsilon}(u)/k\| \le \epsilon \|u\|$

2. $\sigma_{\Lambda,\epsilon}(\Lambda)$ is cyclic.

This theorem implies that if we can solve a lattice problem for cyclic
lattices, we can get an approximate solution for the same problem for any
lattice with arbitrary precision. In other words, the cyclic lattices form a
hard core.

In his celebrated paper [1], Ajtai showed how to generate lattices with a 
connection between the average case and the worst case of variants of SVP. The 
lattices in the constructions in Cai’s and Nerurkar’s improved version of Ajtai’s
result [3] have n/c cycles. Although this result is not an NP-hardness result, 
it raises the question of whether the hardness of lattice problems does or does 
not in general decrease with a higher number of cycles. In [14] it is shown that 
SVP in the maximum norm is NP-complete for lattices with n — 1 cycles, giving 
further evidence that hardness results of lattice problems extend to many cycle 
structures. The result of the current paper gives the main result of [14] as a 
consequence. 

3 The Approximation 

Let $\Lambda \subseteq \mathbb{Z}^n$ be an arbitrary lattice. To adapt this
into a lattice with $n - 1$ cycles that is arbitrarily close to the original
lattice we go through the following five steps:

1 . Inflate the lattice by a factor k and perturb to achieve a lattice with Hermite 
Normal Form of a certain form. 

2. Reduce the sublattice spanned by the first n — 1 vectors of the Hermite 
Normal Form using the LLL algorithm. 

3. Factor the partly reduced basis matrix into two matrices, where the second 
has its determinant equal to one. 

4. Perform modifications to the first matrix to give it n — 1 cycles of equal 
length. 

5. Multiply the two matrices to get a basis for an (n — l)-cyclic lattice that is 
close to the original lattice. 

In Sections 3.1 to 3.4 these steps are described in detail. It is also shown
that the modifications have the desired effect on the cycle structure. In
Section 3.5 we analyze the disturbance from the perturbation and show that it
does not move a lattice vector more than a small multiple of the original
length. All the transformations are linear, and extend through linearity to
any point in $\mathbb{R}^n$.

Many of the proofs are omitted in this extended abstract. The interested 
reader can find them in the full version of the paper. 

3.1 Acquiring a Lattice with a Good Hermite Normal Form 

For the modification to work we need the lattice to have a Hermite Normal Form 
of a certain form. In this section we describe how we efficiently can modify a 
general lattice slightly to get the Hermite Normal Form we need. 

Let $\Lambda \subseteq \mathbb{Z}^n$ be a lattice, and let $H$ be its basis in
Hermite Normal Form. For the coming steps, we need the basis of the lattice to
be of the following form:

$$B = \begin{pmatrix}
1 & 0 & \cdots & 0 & a_1 \\
0 & 1 & \cdots & 0 & a_2 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & a_{n-1} \\
0 & 0 & \cdots & 0 & d
\end{pmatrix} \qquad (1)$$

where $d = \det(\Lambda)$ and $0 \le a_i < d$. We show how to perturb
$\Lambda$ so that we get a lattice whose Hermite Normal Form is as in equation
(1). The method we use is based on the following lemma whose proof can be
found in the full version [15].

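Lemma 4. Let $H$ be a matrix in Hermite Normal Form, i.e.,

$$H = \begin{pmatrix}
h_{11} & h_{12} & h_{13} & \cdots & h_{1(n-1)} & h_{1n} \\
0 & h_{22} & h_{23} & \cdots & h_{2(n-1)} & h_{2n} \\
0 & 0 & h_{33} & \cdots & h_{3(n-1)} & h_{3n} \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & h_{(n-1)(n-1)} & h_{(n-1)n} \\
0 & 0 & 0 & \cdots & 0 & h_{nn}
\end{pmatrix}$$

Then the matrix $\tau(H)$ given by

$$\tau(H) = \begin{pmatrix}
h_{11} & h_{12} & h_{13} & \cdots & h_{1(n-1)} & h_{1n} \\
1 & h_{22} & h_{23} & \cdots & h_{2(n-1)} & h_{2n} \\
0 & 1 & h_{33} & \cdots & h_{3(n-1)} & h_{3n} \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & h_{(n-1)(n-1)} & h_{(n-1)n} \\
0 & 0 & 0 & \cdots & 1 & h_{nn}
\end{pmatrix} \qquad (2)$$

has a Hermite Normal Form as in equation (1). The transformation can be
computed in time polynomial in the size of the input data.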
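Read this way, the perturbation in equation (2) amounts to writing ones on the subdiagonal of $H$. A minimal Python sketch follows; the function name `tau` mirrors the symbol in the lemma, but the list-of-rows representation is an assumption of this illustration.

```python
def tau(h):
    """Return a copy of the square matrix h with ones on the subdiagonal."""
    t = [list(row) for row in h]
    for i in range(1, len(t)):
        t[i][i - 1] = 1
    return t
```

One way to see why the result is cyclic: deleting the first row and the last column of $\tau(H)$ leaves an upper triangular matrix with ones on the diagonal, so one of the $(n-1) \times (n-1)$ minors equals 1.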

We also define the transformation when the input is a vector as

$$\tau_{\Lambda,k}\Bigl(\sum_{i=1}^{n} t_i u_i\Bigr) = \sum_{i=1}^{n} t_i u'_i \qquad (3)$$

where $u_1, u_2, \ldots, u_n$ are the rows of $U$ and
$u'_1, u'_2, \ldots, u'_n$ are the rows of $\tau(kU)$.

As the reader may have noticed, this step actually implies the result from 
[13], although we not only achieve a cyclic lattice, but a lattice whose Hermite 
Normal Form is as defined above. 



3.2 Factoring the Basis 



Now that we have a basis with the Hermite Normal Form we need, we proceed 
by finding a more orthogonal basis and factoring the basis matrix. 

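Let the operation $\rho(B)$ be defined as follows: First the LLL-reduction is
applied to the first $n - 1$ vectors of $B$ using $\delta = 3/4$, keeping the
last vector unchanged. Let us call this intermediate step $\rho'$. Assuming
that the input is a basis matrix $B$ of the form (1), this gives a matrix of
the form

$$\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1(n-1)} & b_{1n} \\
b_{21} & b_{22} & \cdots & b_{2(n-1)} & b_{2n} \\
\vdots & & \ddots & & \vdots \\
b_{(n-1)1} & b_{(n-1)2} & \cdots & b_{(n-1)(n-1)} & b_{(n-1)n} \\
0 & 0 & \cdots & 0 & d
\end{pmatrix} \qquad (4)$$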




376 



M. Trolin 



From the LLL-reduced basis the $(n-1)$'th vector is placed first, keeping the
internal order of the other vectors. The complete transformation is called
$\rho$. The matrix $\rho(B)$ can be factored into

$$\rho(B) = B_l \cdot B_r \qquad (5)$$

Since the determinant of the right factor is 1, the cycle structure of the
product only depends on the left factor. This follows since, as pointed out in
[13], unimodular transformations do not change the cycle structure.

3.3 Modifying the Cycle Structure 

Let B; be the left factor in the basis factorization (5) and B^ the right factor. 
We create a new lattice A' by inflating the lattice spanned by B/ by a factor 
Put differently, the matrix • B; is a basis matrix of M . By Lemma 2, 
this lattice has n — 1 cycles of length and one cycle of length 

By modifying the lattice JV slightly, we get a new lattice that has n — 1 cycles 
of length We call the new lattice N' . The modification is defined by the 

function 7': 





[Definition of $\gamma'_n(d)$: an $n \times n$ matrix with zeros below the
diagonal and powers of $d$ among its other entries; the individual entries are
not recoverable from this copy.]



Theorem 3. The lattice N' with basis matrix 7^ (det (B;)) has n — 1 nontrivial 
cycles, each of which has length . 

The proof is given in the full version. 

3.4 Returning to the Original Representation 

Returning to the original representation is just a matter of multiplying by
$B_r$. Since this does not change the cycle structure ($B_r$ is unimodular),
we still have a lattice with the required cycle structure.

We denote the transformation described in Sections 3.2 to 3.4 by $\gamma$.
More precisely,

$$\gamma(B) = \gamma'_n(\det(B_l)) \cdot B_r$$

where $B_l$ and $B_r$ are the left and right factors of $\rho(B)$ as in (5)
and $n$ is the dimension of the lattice.

We also define the transformation when applied to a vector
$v = \sum_{i=1}^{n} t_i b_i$ in a lattice $\Lambda$ where
$b_1, b_2, \ldots, b_n$ is a basis. The transformation is then defined as

$$\gamma_\Lambda(v) = \sum_{i=1}^{n} t_i b'_i$$

where $b'_1, b'_2, \ldots, b'_n$ are the rows of $\gamma(B)$.

Since LLL-reduction can be performed in polynomial time, $\rho$ can be
computed in polynomial time. It is obvious that also $\gamma'$ and the
factorization in $B_l$ and $B_r$ require at most polynomial time. Hence
$\gamma$ can be computed in time polynomial in the size of the input data.

3.5 Completing the Approximation 

Now we have the necessary steps to complete the approximation. Let
$\Lambda \subseteq \mathbb{Z}^n$ be a lattice. Our goal is to prove that for
any $\epsilon > 0$ there exist a transformation $\sigma_{\Lambda,\epsilon}$
and an integer $k$ such that

1. $\forall u \in \mathbb{Z}^n : \|u - \sigma_{\Lambda,\epsilon}(u)/k\| \le \epsilon \|u\|$.

2. $\sigma_{\Lambda,\epsilon}(\Lambda)$ has $n - 1$ non-trivial cycles of equal length.

The transformations we use are $\tau_{\Lambda,k}$ and $\gamma_\Lambda$ as
described above. Since the displacement for these transformations (as we will
see) depends on the determinant, we need to find an appropriate $k$ that makes
the determinant large enough. In the final approximation we will begin by
applying $\tau$ and then apply $\gamma$. This composed transformation is
called $\sigma_{\Lambda,\epsilon}(u)$ and can be computed in polynomial time
since both $\tau$ and $\gamma$ can be computed in polynomial time.

We bound the displacement introduced by the two transformations $\tau$ and
$\gamma$ described above.

Lemma 5. Let $\Lambda$ be a lattice and let $\tau_{\Lambda,k}$ be defined as
in (3). Then $\forall u \in \mathbb{Z}^n$ :
||u- fe(u)|| < i2”||u||. 

The proof of this lemma follows the proof in [13] closely and is given in the 
full version. 

We need some bounds on the basis (4) before we can complete the proof. We 
give these bounds as two lemmas. The first lemma shows that the coordinates of 
a vector are bounded in a way similar to Lemma 5, and the second that the basis 
vectors are bounded. The full proofs are omitted from this extended abstract. 

Lemma 6. Let $B$ be the basis matrix of $\Lambda$ given in the form (4), let
$b_1, b_2, \ldots, b_n$ be its rows. Assume $u = \sum_{i=1}^{n} t_i b_i$. Then



\U\ < 2.”-^||u|| 

for $i < n$ and for any $\ell_p$-norm (including $\ell_\infty$).



(6) 

Lemma 7. Let $B$ be a basis matrix of the form (1), and let $b_i$ be the row
vectors of the matrix $\rho(B)$. Then it holds that

||bi|| < 



for $i = 2, 3, \ldots, n - 1$.

The idea of the proof is that in an LLL-reduced basis B the length of every 
vector except the last one has an upper bound of the order i/det(B). We then 
need to renumber the vectors since we can only afford the first vector to
remain unbounded in order to bound $\gamma(B)$. It is essential that the bound
is $o(\det(B))$ because of the displacement of $\gamma$.

Now we have the necessary tools to find a bound for the transformation
$\gamma_\Lambda$.

Lemma 8. Let A be an n-dimensional lattice and let 7 a be as defined in Section 
34 . T/ienVuGZ” 



U ; ^-T rr7A(u) 

det(/l)"-2 ^ 

for det(yl) = f2 ^2"^^ . 

Now we combine these two lemmas in order to show a bound for the composed
transformation $\sigma_{\Lambda,\epsilon}$. The proof is omitted.

Theorem 4. Let A be an n-dimensional lattice. For every choice of e > 0 there 
exist integers k and s, at most of size polynomial in log and n, such that 

the transformation aA,s = 1t^{A) ° ta,s generates a lattice with n — 1 cycles of 
equal length and for any vector u 



rjy/42S"+T 

- 77aair"“" 



U- -cta,£(u) 



< s\M 



4 Applications to CVP and SVP 

In this section we will outline how the transformation can be used to find a 
solution to CVP and SVP, should these problems be easier to solve in lattices 
with many cycles. 

In CVP our goal, given a lattice $\Lambda \subseteq \mathbb{Z}^n$ and a point
$y \in \mathbb{Z}^n$, is to find $x \in \Lambda$ such that $\|x - y\|_p$ is
minimized in some $\ell_p$-norm. If (a slightly perturbed) $x$ remains the
lattice point closest to (a slightly perturbed) $y$ after the transformation,
we can reduce the instance of CVP to an instance of CVP in a lattice with many
cycles. The following theorem shows how to choose the transformation
parameters. The proof is given in the full version.

Theorem 5. Let $\Lambda \subseteq \mathbb{Z}^n$, and let
$y \in \mathbb{Z}^n$. Let $x \in \Lambda$ and $z \in \Lambda$. Assume that all
coordinates are in the interval $0, \ldots, \det(\Lambda) - 1$. It holds that
if

l|x-y|lp < ||z-y||p 

then 

-CTA,e(x) - ^cr^.e(y) 
for 

0 < e < {2pn^+^ / P det{ AY+^y" 

and any number p where k is polynomial in e~^. 

The following two lemmas show how to use Theorem 5 to reduce CVP to 
a lattice with n — 1 cycles. The first lemma follows directly from the fact that 
every lattice repeats itself in cubes with side det(yl). 

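Lemma 9. Let $(\Lambda \subseteq \mathbb{Z}^n, y \in \mathbb{Z}^n)$ be an
instance of CVP. Then for any $u \in \mathbb{Z}^n$, $x \in \Lambda$ is a solution
if and only if $x - \det(\Lambda) \cdot u$ is a solution of the instance
$(\Lambda, y - \det(\Lambda) \cdot u)$.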

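Lemma 10. Let $(\Lambda \subseteq \mathbb{Z}^n, y \in \mathbb{Z}^n)$ be an
instance of CVP such that $0 \leq y_i < \det(\Lambda)$. Then $x \in \Lambda$ is a
solution if and only if $\tfrac{1}{k}\sigma_{\Lambda,\varepsilon}(x)$ is a
solution of the instance
$(\tfrac{1}{k}\sigma_{\Lambda,\varepsilon}(\Lambda), \tfrac{1}{k}\sigma_{\Lambda,\varepsilon}(y))$
for k and $\varepsilon^{-1}$ polynomial in $\det(\Lambda)$ and n.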

Proof. The lemma follows directly from Theorem 5.

Using the two lemmas, we can construct the reduction by first reducing the
target vector modulo $\det(\Lambda)$ and then applying the transformation with
the appropriate value of $\varepsilon$.
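The periodicity behind Lemma 9 can be checked on a toy instance. The following is a minimal sketch, not part of the paper: it assumes a hypothetical 2-dimensional basis, uses the Euclidean norm, and finds closest vectors by brute force over a small coefficient box, which is only adequate for such tiny examples.

```python
from itertools import product

def lattice_points(basis, bound):
    """Enumerate integer combinations a*b1 + b*b2 with |a|, |b| <= bound."""
    b1, b2 = basis
    for a, b in product(range(-bound, bound + 1), repeat=2):
        yield (a * b1[0] + b * b2[0], a * b1[1] + b * b2[1])

def closest(basis, target, bound=10):
    """Brute-force closest lattice point in squared Euclidean distance."""
    return min(
        lattice_points(basis, bound),
        key=lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2,
    )

# Hypothetical example lattice (basis vectors are illustrative assumptions).
basis = ((2, 1), (0, 3))
det = abs(basis[0][0] * basis[1][1] - basis[0][1] * basis[1][0])

y = (3, 4)    # target with a unique closest lattice point
u = (1, -2)   # arbitrary integer shift vector

x = closest(basis, y)
shifted_target = (y[0] - det * u[0], y[1] - det * u[1])
x_shifted = closest(basis, shifted_target)

# Lemma 9: shifting the target by det * u shifts the solution by det * u.
assert x_shifted == (x[0] - det * u[0], x[1] - det * u[1])
```

The check works because every vector in $\det(\Lambda) \cdot \mathbb{Z}^n$ is itself a lattice vector of an integer lattice, so translating by it maps the lattice onto itself.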

Obviously the same technique can be used to achieve a similar result for
SVP. The following lemma follows directly from the above lemmas.

Lemma 11. Let $\Lambda \subseteq \mathbb{Z}^n$ be an instance of SVP. Then
$x \in \Lambda$ is a solution if and only if
$\tfrac{1}{k}\sigma_{\Lambda,\varepsilon}(x)$ is a solution of the instance
$\tfrac{1}{k}\sigma_{\Lambda,\varepsilon}(\Lambda)$ for k and $\varepsilon^{-1}$
polynomial in $\det(\Lambda)$ and n.

This leads to the following theorem. 

Theorem 6. SVP is NP-hard to approximate within $\sqrt{2} - \varepsilon$ in
$\ell_2$-norm for n-dimensional lattices with $n - 1$ non-trivial cycles of
equal length.

5 Conclusions 

We have constructed a transformation that, given an n-dimensional lattice of any
cycle structure, produces a lattice with $n - 1$ cycles that is arbitrarily close
to the original lattice. This closes the question of whether SVP and CVP can be
easier to solve in lattices with many cycles. Using the presented result, such a
solution would give a solution for the general case that is at most a polynomial
factor slower in running time. Also, the known inapproximability results for SVP
and CVP extend to lattices with $n - 1$ cycles.

By previous results, we know that any lattice can be approximated arbitrarily
well with a cyclic lattice, and hence that SVP and CVP cannot be easier to
solve in cyclic lattices than in general lattices, except possibly for a polynomial
factor. We now have the two extremes, for one cycle and for $n - 1$ cycles.

From Ajtai's and other papers we also have a hardness result for lattices with
n/c cycles. Together with the results of the current paper, this supports
the general hypothesis that the cycle structure has little importance in deciding
the hardness of a certain lattice.

Although it does seem likely that lattices with m non-trivial cycles also form
a hard core for $2 \leq m \leq n - 2$, we do not have a proof for this. The
current proof does not easily extend to these cycle structures. Since our method
relies on inflating the lattice by a factor $d^t$ to get a lattice with
determinant $d^{nt+1}$ and then making changes to achieve m cycles, the length
of each cycle is $d^{(nt+1)/m}$. Naturally t must be chosen so that
$(nt + 1)/m$ is an integer. In our case, we achieve this by setting $t = n - 2$
and $m = n - 1$. Since the value of t would depend on m and for certain
relations between m and n no such t exists at all, our method cannot directly be
generalized to create any cycle structure where the non-trivial cycles have
equal length.
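The integrality condition on t can be explored numerically. The sketch below is an illustration rather than part of the paper: the function name `smallest_t` is hypothetical, and the gcd test relies on the standard fact that $nt \equiv -1 \pmod{m}$ is solvable exactly when n and m are coprime.

```python
from math import gcd

def smallest_t(n, m):
    """Smallest t >= 1 such that m divides n*t + 1, or None if no t exists."""
    if gcd(n, m) != 1:
        # A common divisor d > 1 of n and m gives n*t + 1 = 1 (mod d), never 0.
        return None
    for t in range(1, m + 1):
        if (n * t + 1) % m == 0:
            return t
    return None

# The choice in the text: m = n - 1 is matched by t = n - 2.
for n in range(3, 12):
    assert smallest_t(n, n - 1) == n - 2

# An example where no t exists: n = 6, m = 4 share the factor 2.
assert smallest_t(6, 4) is None
```

Under this reading, the cycle counts m that the method cannot reach with equal-length cycles are exactly those sharing a factor with n.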

Even if a transformation into m cycles of equal length for $1 < m < n - 1$ were
found, it would still be an open question whether other cycle structures, where
the cycles have different lengths, remain easy. Still, the current result seems to
be a strong indication that the cycle structure does not play an important role
for the computational complexity of lattice problems.

Acknowledgments. I would like to thank Johan Håstad for valuable tips and
ideas in several of the proofs, as well as the anonymous referees for pointing out
possible improvements.

Abstract. Cryptographic protocols can be divided into (1) protocols
where the protocol steps are simple from a computational point of view 
and can thus be modeled by simple means, for instance, single rewrite 
rules — we call these protocols non-looping — and (2) protocols, such as 
group protocols, where the protocol steps are complex and typically 
involve an iterative or recursive computation — we call them recursive. 
While much is known on the decidability of security for non-looping 
protocols, only little is known for recursive protocols. In this paper, we 
prove decidability of security (w.r.t. the standard Dolev-Yao intruder) for 
a core class of recursive protocols and undecidability for several
extensions. The key ingredients of our protocol model are specifically designed
tree transducers which work over infinite signatures and have the
ability to generate new constants (which allows us to mimic key generation).
The decidability result is based on an automata-theoretic construction 
which involves a new notion of regularity, designed to work well with the 
infinite signatures we use. 



1 Introduction 

In most cryptographic protocols, principals are described by a fixed sequence of 
what we call receive-send actions. When performing such an action, a principal 
receives a message from the environment and, after some internal computation, 
reacts by returning a message to the environment. Research on automatic
protocol analysis [23,3,5,19] has concentrated on protocols where a receive-send
action can basically be described by a single rewrite rule of the form
$t \to t'$: When receiving a message m, the message $\sigma(t')$ is returned as
output provided that $\sigma$ is the matcher for t and m, i.e.,
$\sigma(t) = m$. In other words, an input message is
processed by applying the rewrite rule once on the top-level. We call receive-send 
actions of this kind and protocols based on such receive-send actions non-looping. 
It has been proved that for non-looping protocols, when analyzed w.r.t. a finite
number of receive-send actions and the standard Dolev-Yao intruder where the
message size is not bounded, security (more precisely, secrecy) is decidable even
when principals can perform equality tests on arbitrary messages [23,3,5,19],
complex keys are allowed [23,5,19], and the free term algebra assumption is
relaxed by algebraic properties of XOR and Diffie-Hellman Exponentiation
[8,12,9].

Footnote: [...]ted by the Deutsche Forschungsgemeinschaft (DFG).

The main question we are concerned with in this paper is to what extent security
is decidable for protocols where receive-send actions are complex and typically
involve an iterative or recursive computation; we call such receive-send actions
and protocols containing such actions recursive. The answer to this question is
not at all obvious since protocol models for non-looping protocols do not capture
recursive protocols and there are almost no decidability results for recursive
protocols (see the related work).

To illustrate the kind of receive-send actions which are performed in recursive
protocols, let us consider the key distribution server S of the Recursive
Authentication (RA) Protocol [7] (see also Section 4). In this protocol, the
server S needs to perform the following recursive receive-send action: The
server S first receives an a priori unbounded sequence of requests of pairs of
principals who want to share session keys. Then, S generates session keys, and
finally sends a sequence of certificates (corresponding to the requests)
containing the session
keys. Receive-send actions of this kind are typical for group protocols, but also 
occur in protocols such as the Internet Key Exchange protocol (IKE) — see [18] 
for a description of some recursive protocols. As pointed out by Meadows [18] 
and illustrated in [25,14], modeling recursion is security relevant. 

A natural way to describe recursive receive-send actions is by tree
transducers, which extend the class of transductions expressible by single
rewrite rules (with linear left-hand side). More precisely, to study
decidability, in Section 2 we introduce non-deterministic top-down tree
transducers (TTACs) with look-ahead and epsilon transitions which work on a
signature containing an infinite set of what we call anonymous constants (ACs),
over which a TTAC has only very limited control. TTACs can generate new
(anonymous) constants, a feature often needed to model recursive receive-send
actions; in the RA protocol for instance, the key distribution server needs to
generate (an a priori unbounded number of) session keys.

The main result of this paper is that i) security (for a finite number of
receive-send actions, atomic keys, and the standard Dolev-Yao intruder where the
message size is not bounded) is decidable if receive-send actions are modeled by
TTACs (Section 5), and that ii) certain features of models for non-looping
protocols cannot be added without losing decidability: As soon as TTACs are
equipped with the ability to perform equality tests between arbitrary messages,
complex keys are allowed, or the free term algebra assumption is relaxed by
adding XOR or Diffie-Hellman Exponentiation, security is undecidable (Section 6).

The undecidability results are obtained by reductions from Post's
Correspondence Problem. The decidability result is obtained in two steps. First,
we show that TTACs are powerful enough to simulate the intruder. This allows us
to describe attacks as the composition of transducers. We can then reduce the
security problem to the iterated pre-image word problem for a composition of
TTACs, which we show to be decidable (Section 2.3): Given a term t, a "regular
set" R of terms, and a sequence of TTACs, the iterated pre-image word problem
asks whether on input t the composition of TTACs can produce an output in R.
Here, "regular set" means the set of terms recognizable by a new kind of tree
automata, tree automata over signatures with anonymous constants (TAACs),
which can compare anonymous constants for equality.

See our technical report [16] for detailed proofs of all results presented here. 

Related work. Recursive protocols have been analyzed manually [22] and semi- 
automatically using theorem provers or special purpose tools [21,6,17]. 

Decidability for recursive protocols was first investigated in a previous
paper [15]. However, there are several significant differences from the present
paper. Among others, in [15] word transducers, as opposed to tree transducers,
were considered which do not allow generating new constants (e.g., session
keys), a common cause for undecidability of security (see, e.g., [2] and
references therein). Also, the proof techniques employed are completely
different (see [16] for more details).

In various papers, automata-theoretic techniques have been applied to the
analysis of cryptographic protocols (see, e.g., [2] and references therein).
However, these works aim at non-looping protocols and do not seem to be
applicable to recursive protocols in an obvious way. To the best of our
knowledge, the work in [15] and the present work are the first to employ
transducers (over infinite signatures) for protocol analysis. Automata and
transducers over infinite signatures (although different from those considered
here) have been studied in the context of type checking and type inference for
XML queries with data values (see, e.g., [1,20]).

Basic definitions and notation. If $\Sigma$ is a signature, let $\Sigma_n$
denote the set of symbols in $\Sigma$ of arity n. The set of terms over $\Sigma$
is denoted $T_\Sigma$. For a set C of constant symbols (symbols of arity 0)
disjoint from $\Sigma$ let $T_\Sigma(C) = T_{\Sigma \cup C}$. We fix an infinite
supply X of variables among which we find $x_0, x_1, x_2, \ldots$ For $n > 0$,
we write $T_\Sigma^n$ for the set of all terms in
$T_\Sigma(\{x_0, \ldots, x_{n-1}\})$. When t and $t_0, \ldots, t_{n-1}$ are
arbitrary terms, we write $t[t_0, \ldots, t_{n-1}]$ for the term which is
obtained from t by simultaneously substituting $t_i$ for $x_i$, for every
$i < n$. A term $t \in T_\Sigma^n$ is linear if every $x_i$ with $i < n$ occurs
exactly once in t. A substitution over $\Sigma$ is a function
$\sigma: T_\Sigma(X) \to T_\Sigma(X)$ such that for each term t, $\sigma(t)$ is
obtained from t by simultaneously substituting $\sigma(x)$ for x for every
$x \in X$. A subset $\tau$ of $T_\Sigma \times T_\Sigma$ is called a
transduction over $\Sigma$. For $t \in T_\Sigma$, we define
$\tau(t) = \{t' \mid (t,t') \in \tau\}$. If $\tau$ and $\tau'$ are transductions
over $\Sigma$, then their composition $\tau \circ \tau'$ is defined as expected,
where the composition is read from right to left. Given a transduction $\tau$
over $\Sigma$ and a set $R \subseteq T_\Sigma$, the pre-image of R under $\tau$
is the set $\tau^{-1}(R) = \{t \mid \exists t'(t' \in R \wedge (t,t') \in \tau)\}$.
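As a small illustration of these definitions, the following sketch models transductions as finite sets of pairs. It is only a toy: real transductions over terms are usually infinite, the strings standing in for terms are hypothetical, and the reading of "right to left" as "apply $\tau'$ first, then $\tau$" is an assumption.

```python
def compose(tau, tau_prime):
    """Composition tau o tau_prime, read right to left:
    tau_prime is applied first, then tau."""
    return {(t, t2) for (t, t1) in tau_prime for (s, t2) in tau if s == t1}

def preimage(tau, R):
    """Pre-image of R under tau: all t with some t' in R and (t, t') in tau."""
    return {t for (t, t1) in tau if t1 in R}

# Toy transductions over placeholder "terms".
tau_prime = {("a", "b"), ("a", "c")}
tau = {("b", "d"), ("c", "e")}
R = {"e"}

composed = compose(tau, tau_prime)
assert composed == {("a", "d"), ("a", "e")}

# The pre-image under a composition can be computed one transduction at a time.
assert preimage(composed, R) == preimage(tau_prime, preimage(tau, R))
```

The last assertion shows why a pre-image under a composition can be handled stepwise: each step only needs the pre-image under a single transduction.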

2 Tree Automata and Transducers with Anonymous 
Constants 

In this section we describe the models of tree automata and transducers that 
we use, completely independent of the application we have in mind, as they 
are of general interest. (See [11] for more information on tree automata and 
transducers.)

A pair $(\Sigma, C)$ consisting of a finite signature $\Sigma$, the set of
regular symbols, and an arbitrary infinite set C of constant symbols, the set of
anonymous constants, disjoint from $\Sigma$ is called a signature with anonymous
constants. When we speak of a term over $(\Sigma, C)$ we mean a term over
$\Sigma \cup C$. In what follows, let $\mathrm{occ}_C(t)$
($\mathrm{occ}_C(S)$) denote the set of elements from C that occur in the term t
(the set of terms S).



2.1 Tree Automata over Signatures with Anonymous Constants 

Our tree automata are non-deterministic bottom-up tree automata that accept 
trees over signatures with anonymous constants; they have full control over the 
regular symbols but only very limited control over the anonymous constants. 
These automata are designed in such a way that they are powerful enough to 
recognize tree languages such as the one defined in Example 1 (which is needed 
for our cryptographic application) and such that they fit well with the tree 
transducers introduced in Section 2.2, in the sense that we can prove Theorem 1. 

Formally, a tree automaton (TAAC) over a signature $(\Sigma, C)$ with anonymous
constants is a tuple $A = (Q, q^d, q^s, \Delta, F)$ where Q is a non-empty
finite set of states, $q^d \in Q$ is the default state, $q^s \in Q$ is the
selecting state, $\Delta$ is a finite set of transitions as specified below, and
$F \subseteq Q$ is a set of final states. The latter can be omitted; in this
case, we speak of a semi TAAC.

There are two types of transitions: a consuming transition is of the form
$f(q_0, \ldots, q_{n-1}) \to q$ where $f \in \Sigma_n$ and
$q, q_0, \ldots, q_{n-1} \in Q$; an epsilon transition is of the form
$q' \to q$ where $q', q \in Q$.

Each TAAC over a signature with anonymous constants $(\Sigma, C)$ defines a set
of terms (also called trees) from $T_\Sigma(C)$. To describe this set, we view
the set Q as a set of constants and introduce the notion of a permitted
substitution. A permitted substitution $\sigma$ is a function
$\sigma: C \to \{q^d, q^s\}$ where at most one element of C, the selected
element, gets assigned $q^s$ and all the others get assigned $q^d$, the default
value. For $t \in T_\Sigma(Q)$ (which does not contain anonymous constants), we
define $[t]_A$ inductively as follows: The set $[t]_A$ is the smallest set such
that if $t \in Q$, then $t \in [t]_A$; if $t = f(t_0, \ldots, t_{n-1})$ and
there exist $q_0, \ldots, q_{n-1}$ such that
$f(q_0, \ldots, q_{n-1}) \to q \in \Delta$ and $q_i \in [t_i]_A$ for every
$i < n$, then $q \in [t]_A$; if $q \in [t]_A$ and $q \to q' \in \Delta$, then
$q' \in [t]_A$. For every term $t \in T_\Sigma(C \cup Q)$ (which may contain
anonymous constants), the set $[t]_A$ of states which the automaton reaches
after having read the term t is defined to be the union of all sets
$[\sigma(t)]_A$ where the union is taken over all permitted substitutions
$\sigma$. Now, the tree language recognized by A is the language
$T(A) = \{t \in T_\Sigma(C) \mid F \cap [t]_A \neq \emptyset\}$. We say a tree
language over $(\Sigma, C)$ is TAAC recognizable over $(\Sigma, C)$ if it is
recognized by some TAAC over $(\Sigma, C)$.

Example 1. Assume $\Sigma_2 = \{f\}$ and $\Sigma_i = \emptyset$ for every
$i \neq 2$. Let $T_= = \{f(c, c) \mid c \in C\}$. This language is recognized by
a TAAC with only three states, say $q_0$, $q_1$, and $q_2$. We choose
$q^d = q_0$, $q^s = q_1$, $F = \{q_2\}$, and we have only one transition, namely
$f(q_1, q_1) \to q_2$.

We will also use what we call a weak TAAC (WTAAC) which does not have a
selecting state, i.e., the default state is assigned to all anonymous constants.
WTAACs are really weaker because it is easy to see that, for instance, $T_=$ is
not WTAAC recognizable over $(\Sigma, C)$. We can prove the following basic
properties of TAACs and WTAACs.
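Example 1 can be replayed with a small simulator. The sketch below is an illustrative reading of the definitions, not the authors' construction: the class name `TAAC`, the encoding of anonymous constants as Python strings, and the encoding of regular terms as tuples `(symbol, child_0, ...)` are all assumptions. Since selecting a constant that does not occur in the term behaves like selecting none, only the occurring constants are tried.

```python
def anon_constants(term):
    """Anonymous constants (encoded as strings) occurring in a term."""
    if isinstance(term, str):
        return {term}
    found = set()
    for child in term[1:]:
        found |= anon_constants(child)
    return found

class TAAC:
    def __init__(self, default, selecting, consuming, epsilon, final):
        self.default = default        # q^d
        self.selecting = selecting    # q^s
        self.consuming = consuming    # set of (symbol, (q_0, ..., q_{n-1}), q)
        self.epsilon = epsilon        # set of (q, q2), meaning q -> q2
        self.final = final            # F

    def _close(self, states):
        """Close a set of states under epsilon transitions."""
        states = set(states)
        changed = True
        while changed:
            changed = False
            for q, q2 in self.epsilon:
                if q in states and q2 not in states:
                    states.add(q2)
                    changed = True
        return states

    def _reach(self, term, selected):
        """States reached on term under the permitted substitution
        that selects `selected` (or nothing if selected is None)."""
        if isinstance(term, str):
            start = self.selecting if term == selected else self.default
            return self._close({start})
        symbol, children = term[0], term[1:]
        child_states = [self._reach(c, selected) for c in children]
        reached = set()
        for sym, args, q in self.consuming:
            if sym == symbol and len(args) == len(children) and all(
                a in s for a, s in zip(args, child_states)
            ):
                reached.add(q)
        return self._close(reached)

    def accepts(self, term):
        choices = [None] + sorted(anon_constants(term))
        return any(self._reach(term, sel) & self.final for sel in choices)

# Example 1: q^d = q0, q^s = q1, F = {q2}, one transition f(q1, q1) -> q2.
example1 = TAAC("q0", "q1", {("f", ("q1", "q1"), "q2")}, set(), {"q2"})
assert example1.accepts(("f", "c", "c"))
assert not example1.accepts(("f", "c", "d"))
```

The second assertion fails to reach $q_2$ because at most one anonymous constant can be mapped to the selecting state, so two different constants can never both be read as $q_1$.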

Lemma 1. Let $(\Sigma, C)$ be a signature with an infinite set C of anonymous
constants. The set of TAAC recognizable tree languages over $(\Sigma, C)$ is
closed under union. It is closed under intersection/complementation over
$(\Sigma, C)$ iff $\Sigma = \Sigma_0 \cup \Sigma_1$. The word problem and the
emptiness problem for TAACs are decidable. The set of WTAAC recognizable tree
languages over $(\Sigma, C)$ is closed under union, intersection, and
complementation.

2.2 Tree Transducers over Signatures with Anonymous Constants 

As mentioned in the introduction, we use non-deterministic top-down tree
transducers with epsilon transitions which have the following specific features:
a WTAAC look-ahead; generation of new (!) anonymous constants; a register for
one anonymous constant.

To define our transducers we need some notation. We fix a signature
$(\Sigma, C)$ with anonymous constants and a finite set S of states, whose
elements we view as binary symbols. We assume that we are given a set
$V = \{v_R, v_N\}$ of two variables for anonymous constants: $v_R$ represents
the aforementioned register, $v_N$ refers to a newly generated anonymous
constant. A state term is of the form $s(z, t)$ for $s \in S$,
$z \in V \cup C \cup \{*\}$, and $t \in T_\Sigma(C \cup X)$. The term t is then
called the core term of this term. If z belongs to some set
$D \subseteq V \cup C \cup \{*\}$, then we say $s(z,t)$ is a D-state term.
Intuitively, a state term of the form $s(*,t)$ or $s(c,t)$ with $c \in C$ is
part of a configuration of a transducer and means that the transducer is about
to read t starting in state s where the register does not store a value or
stores the anonymous constant c, respectively. To describe transitions we use
state terms of the form $s(v_R,t)$, $s(v_N,t)$, and again $s(*,t)$, but not
$s(c,t)$ (see below).

Formally, a tree transducer (TTAC) over a signature $(\Sigma, C)$ with anonymous
constants is a tuple $\mathcal{T} = (S, I, A, \Gamma)$ where S is a finite set
of states, $I \subseteq S$ is a set of initial states, A is a semi WTAAC over
$(\Sigma, C)$, and $\Gamma$ is a finite set of transitions. A transition is of
the form

$s(z,t) \to^{q} t'[v_R, v_N, t'_0, \ldots, t'_{r-1}]$ (1)

where $q \in Q$ is the look-ahead, $s(z,t)$ is a $\{v_R, *\}$-state term (recall
that this means that $z = v_R$ or $z = *$) with $t \in T_\Sigma^n$ and t linear,
$t' \in T_\Sigma^{r+2}$ (not necessarily linear), and each $t'_i$ is either a
variable $x_j$ with $j < n$ or a $\{z, v_N, *\}$-state term with the core term
being a subterm of t. If $z = *$, we require that $v_R$ does not occur in
$t'[v_R, v_N, t'_0, \ldots, t'_{r-1}]$. If $v_N$ occurs in this term, the
transition is called generative and non-generative otherwise.

The computation the TTAC carries out is described by a sequence of rewrite
steps. The corresponding rewrite relation $\vdash_U$ is defined w.r.t. a subset
$U \subseteq C$ of anonymous constants to ensure that newly generated constants
do not belong to U. Later, U will be the set of anonymous constants in the input
term, which then ensures that the anonymous constants generated by the TTAC are
different from those occurring in the input. To define $\vdash_U$, assume we are
given a term $u_0 = u_1[s(c, u_2)]$ where $c \in C$, a term
$u_2 = t[t_0, \ldots, t_{n-1}] \in T_\Sigma(C)$, a transition as in (1) with
$z = v_R$, and assume $u_1$ is linear. Let $\sigma$ be the substitution defined
by $\sigma(x_i) = t_i$. Then, for every
$c' \in C \setminus (\mathrm{occ}_C(u_0) \cup U)$, if $q \in [u_2]_A$ we define
$u_0 \vdash_U u_1[t'[c, c', \sigma(t'_0), \ldots, \sigma(t'_{r-1})]]$. (Observe
that if the transition is non-generative, $v_N$ and U are irrelevant.) Note that
the newly generated anonymous constant $c'$ does not occur in U and the output
term computed so far. The rewrite step in case $u_0 = u_1[s(*, u_2)]$ is defined
analogously where transitions as in (1) can only be applied if $z = *$.

Let $\vdash_U^*$ denote the reflexive transitive closure of $\vdash_U$. The
transduction over $(\Sigma, C)$ defined by the TTAC $\mathcal{T}$ is

$\tau_{\mathcal{T}} = \{(t,t') \in T_\Sigma(C) \times T_\Sigma(C) \mid \exists s\,(s \in I \wedge s(*,t) \vdash^*_{\mathrm{occ}_C(t)} t')\}$.

A transduction $\tau$ on $(\Sigma, C)$ is called TTAC realizable if there exists
a TTAC $\mathcal{T}$ such that $\tau_{\mathcal{T}} = \tau$. Just as for
classical non-deterministic top-down tree transducers [11], it is easy to see
that the set of TTAC realizable transductions is not closed under composition.



2.3 The Iterated Pre-image Word Problem 

The iterated pre-image word problem is defined as follows: 

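IteratedPreImage. Given a term t over $(\Sigma, C)$, a TAAC B over
$(\Sigma, C)$, and a sequence of TTACs
$\mathcal{T}_0, \ldots, \mathcal{T}_{l-1}$ over $(\Sigma, C)$ with
$\tau = \tau_{\mathcal{T}_0} \circ \cdots \circ \tau_{\mathcal{T}_{l-1}}$,
decide whether $t \in \tau^{-1}(T(B))$.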

The key for proving decidability of this problem is: 

Theorem 1. The pre-image of a TAAC recognizable tree language under a 
TTAC realizable transduction is a TAAC recognizable tree language. Moreover, 
an appropriate TAAC can be constructed effectively. 

To prove this theorem, given a TAAC B and a TTAC $\mathcal{T}$, we provide an
exponential time construction of a TAAC recognizing
$\tau_{\mathcal{T}}^{-1}(T(B))$. Using Theorem 1 and Lemma 1 (decidability of
the word problem), we obtain:

Corollary 1. IteratedPreImage is decidable. 

3 The Tree Transducer-Based Protocol Model 

We now define our tree transducer-based protocol model by specifying
messages, the intruder, protocols, and attacks. As mentioned in the
introduction, the main difference between the model presented here and decidable
models for non-looping protocols is the way receive-send actions are described:
instead of single rewrite rules, we use TTACs. These transducers have two
important features necessary to model recursive receive-send actions, but
missing in decidable models for non-looping protocols: First, they allow a set
of rewrite rules to be applied recursively to a term. Second, they allow new
constants to be generated, a feature not necessary for non-looping protocols
when analyzed w.r.t. a finite number of receive-send actions.

Due to the space limit, the exposition below is slightly simplified. For details, 
the reader is referred to [16]. 

3.1 Messages 

The definition of messages we use here is rather standard, except that we allow 
an infinite number of (anonymous) constants. As mentioned, we assume keys to 
be atomic. 

More precisely, messages are defined as terms over the signature
$(\Sigma_\mathcal{A}, C)$ with anonymous constants. The set C is some infinite
set of anonymous constants, which in this paper will be used to model session
keys (Section 4). The finite signature $\Sigma_\mathcal{A}$ is defined relative
to a finite set $\mathcal{A}$ of constants, the set of atomic messages, which
may for instance contain principal names and (long-term) keys. It also contains
a subset $\mathcal{K} \subseteq \mathcal{A}$ of public and private keys which is
equipped with a bijective mapping $\cdot^{-1}$ assigning to a public (private)
key $k \in \mathcal{K}$ its corresponding private (public) key
$k^{-1} \in \mathcal{K}$. Now, $\Sigma_\mathcal{A}$ denotes the (finite)
signature consisting of the constants $\mathcal{A}$, the unary symbols
$\mathrm{hash}_a$ (keyed hash) and $\mathrm{enc}^s_a$ (symmetric encryption) for
every $a \in \mathcal{A}$, $\mathrm{enc}^a_k$ (asymmetric encryption) for every
$k \in \mathcal{K}$, and the binary symbol $\langle\rangle$ (pairing). Instead
of $\langle\rangle(t,t')$ we write $\langle t,t' \rangle$. We point out that
$\mathrm{hash}_a(m)$ shall represent the keyed hash of m under the key a plus m
itself. Note that anonymous constants are not allowed as keys (see also Sections
3.4 and 7). The set of messages over $(\Sigma_\mathcal{A}, C)$ is denoted
$\mathcal{M} = T_{\Sigma_\mathcal{A}}(C)$.



3.2 Receive-Send Actions, Principals, and Protocols 

A receive-send action is a TTAC over $(\Sigma_\mathcal{A}, C)$. Roughly
speaking, a principal is defined to be a finite sequence of receive-send
actions, where the last action may or may not be marked to be what we call a
challenge output action (see [16] for a precise definition of principals). The
purpose of challenge output actions is explained below. A protocol is a tuple
consisting of a finite family of principals and a finite set
$\mathcal{S} \subseteq \mathcal{M}$, the initial intruder knowledge.



3.3 The Intruder 

As in the case of models for non-looping protocols, our intruder model is based
on the well-known and widely used Dolev-Yao intruder [13]. That is, an intruder
has complete control over the network and can derive new messages from
his current knowledge by composing, decomposing, encrypting, decrypting, and
hashing messages. We do not impose any restrictions on the size of messages.

The (possibly infinite) set of messages $d(\mathcal{S})$ the intruder can derive
from some set $\mathcal{S} \subseteq \mathcal{M}$ is the smallest set satisfying
the following conditions: $\mathcal{S} \subseteq d(\mathcal{S})$; if
$\langle m, m' \rangle \in d(\mathcal{S})$, then $m, m' \in d(\mathcal{S})$
(decomposition); if $\mathrm{enc}^s_a(m) \in d(\mathcal{S})$ and
$a \in d(\mathcal{S})$, then $m \in d(\mathcal{S})$ (symmetric decryption); if
$\mathrm{enc}^a_k(m) \in d(\mathcal{S})$ and $k^{-1} \in d(\mathcal{S})$, then
$m \in d(\mathcal{S})$ (asymmetric decryption); if
$\mathrm{hash}_a(m) \in d(\mathcal{S})$, then $m \in d(\mathcal{S})$ (obtaining
hashed messages); if $m, m' \in d(\mathcal{S})$, then
$\langle m, m' \rangle \in d(\mathcal{S})$ (composition); if
$m \in d(\mathcal{S})$ and $a \in \mathcal{A} \cap d(\mathcal{S})$, then
$\mathrm{enc}^s_a(m), \mathrm{hash}_a(m) \in d(\mathcal{S})$ (symmetric
encryption and keyed hash); and if $m \in d(\mathcal{S})$ and
$k \in \mathcal{K} \cap d(\mathcal{S})$, then
$\mathrm{enc}^a_k(m) \in d(\mathcal{S})$ (asymmetric encryption).
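The derivation rules above can be turned into a membership test for a single message. The sketch below is an assumption-laden illustration, not the paper's construction: it uses the common two-phase approach (first close the knowledge under the decomposition rules, then check whether the target can be composed), which is adequate here because keys are atomic. The tuple encoding (`"pair"`, `"senc"`, `"aenc"`, `"hash"`) and the `inverse` key map are hypothetical.

```python
def analyze(knowledge, inverse):
    """Close a finite knowledge set under the decomposition rules."""
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for m in list(known):
            if isinstance(m, str):      # atomic message or anonymous constant
                continue
            tag = m[0]
            parts = []
            if tag == "pair":                                   # decomposition
                parts = [m[1], m[2]]
            elif tag == "senc" and m[1] in known:               # symmetric decryption
                parts = [m[2]]
            elif tag == "aenc" and inverse.get(m[1]) in known:  # asymmetric decryption
                parts = [m[2]]
            elif tag == "hash":                                 # hash reveals its content
                parts = [m[2]]
            for p in parts:
                if p not in known:
                    known.add(p)
                    changed = True
    return known

def derivable(message, knowledge, inverse):
    """Check whether message is in d(knowledge) for atomic keys."""
    known = analyze(knowledge, inverse)

    def synth(t):
        if t in known:
            return True
        if isinstance(t, str):
            return False
        if t[0] == "pair":
            return synth(t[1]) and synth(t[2])
        if t[0] in ("senc", "hash", "aenc"):
            return t[1] in known and synth(t[2])   # key must be a known atom
        return False

    return synth(message)

# Hypothetical example: a session key wrapped under a long-term key Ka.
inverse = {"pub": "priv", "priv": "pub"}
certificate = ("senc", "Ka", ("pair", "Kab", "Na"))
assert derivable("Kab", {certificate, "Ka"}, inverse)
assert not derivable("Kab", {certificate}, inverse)
```

With composed keys the two phases would interact and this simple split would no longer be sufficient, which is one reason the atomic-key assumption matters.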

We note that although principals have the ability to generate new (anonymous)
constants, as they are defined in terms of TTACs, for the intruder adding
this ability is not necessary since it would not increase his power to attack
protocols (see [16] for more details).

3.4 Attacks on Protocols 

In a (successful) attack on a protocol P, the intruder chooses an interleaving
$\mathcal{T}_0, \ldots, \mathcal{T}_{l-1}$ of the receive-send actions of the
principals in P (i.e., a total ordering of these receive-send actions) in such a
way that i) the last receive-send action $\mathcal{T}_{l-1}$ in this
interleaving is a challenge output action, ii) the intruder can produce the
inputs $m_i$ for the receive-send actions, and iii) from the messages $m'_i$
returned by the receive-send actions and his initial knowledge $\mathcal{S}$ he
can derive a secret message. The secret message the intruder tries to derive is
a regular or anonymous constant determined by the challenge output action
$\mathcal{T}_{l-1}$ and it is presented to the intruder as a challenge but not
added to his knowledge. In the following definition of attack, due to space
limitations and for simplicity of notation, in this extended abstract we assume
that the interleaving of receive-send actions is given beforehand. Since the
number of possible interleavings is finite, from a decidability point of view
this is not a restriction (see [16] for a full definition of attack). The second
condition in the following definition ensures that new anonymous constants
generated in the ith receive-send action are also new w.r.t. the knowledge of
the intruder before the ith action is performed.

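Attack. Given a finite set $\mathcal{S} \subseteq \mathcal{M}$ (the initial
intruder knowledge), TTACs $\mathcal{T}_0, \ldots, \mathcal{T}_{l-1}$ (the
interleaving of receive-send actions) with
$\mathcal{T}_i = (S_i, I_i, A_i, \Gamma_i)$ for $i < l$, decide whether there
exist messages $m_i, m'_i \in \mathcal{M}$, $i < l$, such that

1. $(m_i, m'_i) \in \tau_{\mathcal{T}_i}$ for every $i < l$,

2. $(\mathrm{occ}_C(m'_i) \setminus \mathrm{occ}_C(m_i)) \cap \mathrm{occ}_C(\mathcal{S}_i) = \emptyset$ for every $i < l$,

3. $m_i \in d(\mathcal{S}_i)$ for every $i < l$, and

4. $m'_{l-1} \in d(\mathcal{S}_{l-1}) \cap (\mathcal{A} \cup C)$. (Can the intruder derive the challenge?)

where $\mathcal{S}_i = \mathcal{S} \cup \{m'_0, \ldots, m'_{i-1}\}$ is the
intruder's knowledge before the ith receive-send action is performed.

We write $(\mathcal{S}, \mathcal{T}_0, \ldots, \mathcal{T}_{l-1}) \in$ Attack if
all the above conditions are satisfied.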

The use of challenge output actions, as presented above, allows secrets to be
determined dynamically, depending on the protocol run. This is for example
needed when asking whether the intruder is able to derive a session key (an
anonymous constant, which may change from one protocol run to another) generated
by a key distribution server. Challenge output actions are somewhat related to
the way security is defined in computational models for key distribution
protocols where at the end of an attack, the intruder is presented a string for
which he needs to decide whether it is an actual session key or just some random
string [4]. In [16], we discuss alternatives to challenge output actions.

4 Modeling Recursive Cryptographic Protocols 

To illustrate the TTAC-based protocol model, we now present a formal
description of the key distribution server, called S in the following, of the
Recursive Authentication (RA) Protocol [7] (see [16] for a complete description
and a more detailed account of the RA protocol). In what follows, we abbreviate
messages of the form
$\langle m_0, \langle \ldots \langle m_{n-1}, m_n \rangle \cdots \rangle\rangle$
by $m_0, \ldots, m_n$.

The server S shares a long-term (symmetric) key with every principal and 
performs only one (recursive) receive-send action in a protocol run. In this 
receive-send action, S receives an a priori unbounded sequence of requests of 
pairs of principals who want to obtain session keys for secure communication 
and has to generate certificates for the principals containing the session keys. 
An example of the kind of message S receives is 

$m = \mathrm{hash}_{K_c}(C, S, N_c, \mathrm{hash}_{K_b}(B, C, N_b, \mathrm{hash}_{K_a}(A, B, N_a, -)))$

where $N_c$, $N_b$, and $N_a$ are nonces generated by C, B, and A respectively,
and $K_c$, $K_b$, and $K_a$ are the long-term keys shared between the server S
and the principals C, B, and A, respectively. The above message consists of
three requests and indicates that C wants to share a session key with S, B with
C, and A with B. The symbol "-" marks the end of the sequence of requests. It is
important to note that messages sent to S may contain an arbitrary number of
requests, which must be processed by S recursively. Now, given m, S processes
the requests starting from the outermost. First, S generates two certificates
for C, namely, $\mathrm{enc}^s_{K_c}(K_{cs}, S, N_c)$ and
$\mathrm{enc}^s_{K_c}(K_{cb}, B, N_c)$, where $K_{cs}$ and $K_{cb}$ are session
keys generated by S and intended to be used by C for communication with S and B,
respectively. In the same way, certificates for B and A are generated, where A
only obtains one certificate (containing the session key for communication with
B).

We now describe S by the TTAC $\mathcal{T}_S$. Let $P_0, \ldots, P_n$ be the
principals participating in the RA protocol. We assume that $P_n = S$ is the
server. Every $P_i$, $i < n$, shares a long-term key $K_i$ with S. The
transducer $\mathcal{T}_S$ has two states, start and read, and does not need a
look-ahead; we will need a look-ahead to model the intruder in terms of a TTAC
(Section 5). In state start, the initial state, $\mathcal{T}_S$ checks whether
the first request is addressed to S and initializes the process of reading the
requests by generating one session key which is stored in the register. In state
read, the requests are processed. In this phase, the register is used to store a
session key while moving from one request to the next.

The transitions of $\mathcal{T}_S$ are specified as follows:

$\mathrm{start}(*, \mathrm{hash}_{K_i}(P_i, P_n, x_0, x_1)) \to \mathrm{read}(v_N, \mathrm{hash}_{K_i}(P_i, P_n, x_0, x_1))$

$\mathrm{read}(v_R, \mathrm{hash}_{K_i}(P_i, P_j, x_0, -)) \to \mathrm{enc}^s_{K_i}(v_R, P_j, x_0)$

$\mathrm{read}(v_R, \mathrm{hash}_{K_i}(P_i, P_j, x_0, \mathrm{hash}_{K_{i'}}(P_{i'}, P_i, x_1, x_2))) \to \mathrm{enc}^s_{K_i}(v_R, P_j, x_0),\ \mathrm{enc}^s_{K_i}(v_N, P_{i'}, x_0),\ \mathrm{read}(v_N, \mathrm{hash}_{K_{i'}}(P_{i'}, P_i, x_1, x_2))$

where $i, i', j \leq n$ and $x_0, x_1, x_2$ are variables which take arbitrary
messages, and $v_R$ and $v_N$ are the variables for the register and the new
anonymous constants, respectively.
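The recursion performed by the server can be sketched in a few lines. This is a deterministic simulation written for illustration only, not the transducer itself: the tuple encoding of hashed requests, the helper `hash_req`, the `keys` table, and the fresh-key names `k0`, `k1`, ... (standing in for new anonymous constants) are all hypothetical.

```python
from itertools import count

END = "-"  # end-of-requests marker

def hash_req(key, sender, receiver, nonce, rest):
    """Hypothetical encoding of hash_key(sender, receiver, nonce, rest)."""
    return ("hash", key, sender, receiver, nonce, rest)

def server(request, keys, server_name="S"):
    """Walk over nested requests, mirroring the start and read states."""
    fresh = (f"k{i}" for i in count())   # stand-in for new anonymous constants
    _, key, sender, receiver, nonce, rest = request
    if receiver != server_name or key != keys[sender]:
        raise ValueError("first request is not addressed to the server")
    register = next(fresh)               # start: new key goes into the register
    certificates = []
    while True:
        # read: certificate for the current sender, using the registered key
        certificates.append(("enc", key, register, receiver, nonce))
        if rest == END:
            return certificates
        _, in_key, in_sender, in_receiver, in_nonce, in_rest = rest
        if in_receiver != sender or in_key != keys[in_sender]:
            raise ValueError("malformed inner request")
        new_key = next(fresh)            # generative step: fresh session key
        certificates.append(("enc", key, new_key, in_sender, nonce))
        register = new_key               # carried over to the next request
        key, sender, receiver, nonce, rest = (
            in_key, in_sender, in_receiver, in_nonce, in_rest)

keys = {"A": "Ka", "B": "Kb", "C": "Kc"}
m = hash_req("Kc", "C", "S", "Nc",
             hash_req("Kb", "B", "C", "Nb",
                      hash_req("Ka", "A", "B", "Na", END)))
certs = server(m, keys)
assert len(certs) == 5   # two for C, two for B, one for A
```

On the example message, the key stored in the register when moving from C's request to B's is the same key that appears in C's second certificate and B's first, which is how the two principals end up sharing it.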



5 The Decidability Result 

The main result of this section is the following: 

Theorem 2. Attack is decidable. 

The proof of this theorem proceeds in two steps. In the first step, we show that
the intruder can be simulated by a TTAC, which we call $T_{der}$. We point out that
for the construction of $T_{der}$ the use of a look-ahead is necessary to gather all
information about which keys can be accessed in a given message. In the second
step of the proof of Theorem 2, we describe attacks as a composition of transducers
$T_{der}, T_{l-1}, T_{der}, T_{l-2}, T_{der}, \ldots, T_{der}, T_0, T_{der}$ (applied from right to left).
More accurately, these transducers need to be slightly modified to pass on the
intruder's knowledge from one transducer to the next. The (slightly modified)
transducer $T_{l-1}$ will produce a pair $(m, m'_{l-1})$ where m represents the intruder's
current knowledge $S_{l-1}$ and $m'_{l-1}$ is the challenge. Given $(m, m'_{l-1})$, the transducer
$T_{der}$ (again slightly modified in a similar fashion as above) will now try
to transform m into $m'_{l-1}$, i.e., try to derive the challenge from m, without using
$m'_{l-1}$. In other words, in the last step $T_{der}$ tries to produce a pair in the set

$R = \{(a, a) \mid a \in \mathcal{A}\} \cup \{(c, c) \mid c \in C\}$. Using Example 1, it is easy to see that R is
TAAC recognizable over $(\Sigma_{\mathcal{A}}, C)$. The following lemma formalizes the reduction
from Attack to IteratedPreImage, where $\tau$ is the transduction obtained
from the composition of transducers just described and $m_S = (u_0, (\cdots (u_{n-2}, u_{n-1}) \cdots))$
represents the intruder's initial knowledge $S = \{u_0, \ldots, u_{n-1}\}$. This
lemma, together with Corollary 1, immediately implies Theorem 2.

Lemma 2. We have $(S, T_0, \ldots, T_{l-1}) \in$ Attack if and only if $m_S \in \tau^{-1}(R)$.

6 Adding Features of Models for Non-looping Protocols 
and Undecidability Results 

In the TTAC-based protocol model as introduced in Section 3, many non-looping 
protocols can be analyzed with the same precision as in decidable models for non- 
looping protocols with atomic keys (see, e.g., [3]). More precisely, this is the case 
for protocols where a) the receive-send actions can be described by rewrite rules
with linear left-hand side, since TTACs can simulate all such rewrite rules, and 
b) only a finite amount of information needs to be conveyed from one receive- 
send action to the next. This includes for instance many of the protocols in 
the Clark- Jacobs library [10] (see [16] for a formal TTAC-based model of the 
Needham-Schroeder Public Key Protocol). 

However, some features present in decidable models for non-looping protocols
are missing in the TTAC-based protocol model: i) Equality tests for messages of 
arbitrary size, which are possible when left-hand sides of rewrite rules may be 
non-linear (this corresponds to allowing non-linear left-hand sides in transitions 
of TTACs) or arbitrary messages can be conveyed from one receive-send action 
to another and can then be compared with other messages [3,23,19,5]; ii) complex 
keys, i.e., keys that may be arbitrary messages [23,19,5]; and iii) relaxing the free 
term algebra assumption by adding the XOR operator [8,12] or Diffie-Hellman 
Exponentiation [9] . The main result of this section is that these features cannot 
be added without losing decidability (see [16] for a formal statement and the 
proof, in which we present reductions from Post’s Correspondence Problem): 

Theorem 3. Attack is undecidable when one (or more) of the above features
is added to the TTAC-based protocol model. 



7 Conclusion 

The main goal of this paper was to shed light on the feasibility of automatic 
analysis of recursive cryptographic protocols. The results obtained here trace a 
fairly tight boundary of the decidability of security for such protocols. To obtain 
our results we introduced tree automata (TAACs) and transducers (TTACs) 
over signatures with an infinite set of (anonymous) constants and proved that 
for TTACs the iterated pre-image word problem is decidable. We believe that 
the study of TAACs and TTACs started here is of independent interest. 

Our decision procedure for finding attacks on protocols is non-elementary and 
the problem can easily be seen to be EXPTIME-hard. Thus, one open problem 
is to establish tight complexity bounds. While so far we do not allow anonymous 
constants as keys, this would be an interesting extension of our model. In this 
paper, we have identified the computation of pre-images as a means to analyze 
protocols. It is worthwhile to investigate to what extent this method is practical 
and whether it could be an alternative to constraint solving approaches usually
employed for the analysis of (non-looping) protocols. 
Abstract. Motivated by a scheduling problem encountered in multicast
environments, we study a vertex labelling problem, called Minimum Cir- 
cular Arrangement (MCA), that requires one to find an embedding of 
a given weighted directed graph into a discrete circle which minimizes 
the total weighted arc length. Its decision version is already known to 
be NP-complete when restricted to sparse weighted instances. We prove 
that the decision version of even un-weighted MCA is NP-complete in 
case of sparse as well as dense graphs. 

We also consider the complementary version of MCA, called MaxCA.
We prove that it is MAX-SNP[π] complete and, therefore, has no
PTAS unless P=NP. A similar proof technique shows that MCA is
MAX-SNP[π]-hard and hence admits no PTAS either. Then we prove
a conditional lower bound of $\sqrt{2} - \varepsilon$ for MCA approximation under
some hardness assumptions, and conclude with a PTAS for MCA on
dense instances.

Keywords: Computational complexity, hardness of approximation, 
polynomial time approximation scheme, scheduling, multicast. 



1 Introduction 

Availability of very high-speed and large bandwidth networks, explosion in inter- 
networking, and advent of cheap, low-power, portable computing devices have 
given rise to one-to-many asymmetric communication networks and huge client 
populations having commonality of interests [1]. In such environments, servers 
are endowed with much more computing power and have access to much larger 
bandwidth than clients. Therefore, it becomes cost effective to push data from 
server side rather than follow traditional client-server based pull model. This can 
be achieved using multicast where a server needs to send a data unit only once 
to reach arbitrary number of clients. 

A common way to use multicast in data dissemination is to use server ini- 
tiated repetitive multicast where a server cyclically multicasts data to a large 

* Full version of the paper is available online [12]. 
client population. This finds application in many diverse domains, e.g.,
high-throughput database systems [15], data management in broadcast disks [1],
solving scalability problems of heavily loaded Web servers [3], content delivery
networks (CDNs) [22], etc.

A fundamental question is the order in which the server should multicast 
data, that is, scheduling. In general, clients are seldom interested in individual 
data items, and attempt to download multiple items. For example, Web clients 
hardly ever access only one HTML resource, but access almost always the HTML 
document along with all its embedded images [17]. Database clients often access 
multiple items to complete a read transaction [24]. Thus client access patterns 
often show dependencies between consecutive requests, so that the request for 
a data unit will make it more likely or less likely that certain data unit will be 
requested next. These access patterns must be taken into account while design- 
ing a good cyclic multicast schedule that has low client-perceived latency while 
accessing multi-item objects [19]. 

One way to model this scenario is to treat the server data set as a weighted 
directed graph where nodes represent server data units and arc weights represent 
the strength of the dependency. The scheduling problem then becomes the following
question in combinatorial optimization:

Minimum Circular Arrangement (MCA): Given a directed weighted graph
$G = (V, E, w)$ with non-negative weights, find a surjection $f : V \to \{0, 1, \ldots, n-1\}$
which minimizes $\sum_{e \in E} w(e)\,\ell(e)$, where $\ell(e) = (f(v) - f(u)) \bmod n$ for
$e = (u, v)$. Here $\ell(e)$ is called the latency of the edge e in the arrangement f.
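
As a concrete reading of this definition, the cost of a given arrangement can be computed directly. The sketch below assumes a dictionary representation of the graph and the arrangement; the name `mcost` is chosen for illustration only.

```python
def mcost(edges, f):
    """Circular cost of arrangement f.

    edges: dict mapping (u, v) to a non-negative weight
    f:     dict mapping each vertex to a distinct position in 0..n-1
    """
    n = len(f)
    # Latency of (u, v) is the clockwise distance from u to v on the circle.
    return sum(w * ((f[v] - f[u]) % n) for (u, v), w in edges.items())

triangle = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1}
along = mcost(triangle, {"a": 0, "b": 1, "c": 2})     # every edge has latency 1
against = mcost(triangle, {"a": 0, "c": 1, "b": 2})   # every edge has latency 2
```

On a directed triangle, arranging the vertices along the cycle costs 3, while the opposite order costs 6, which shows how strongly the direction of the arcs matters on the circle.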

1.1 Related Problems 

The MCA problem first appeared in work of Liberatore [19]. It falls under the 
class of vertex labelling problems where the question is to find a labelling of the 
vertices which optimizes some cost function. This class includes many interesting 
practical problems [7], e.g., optimal linear arrangement problem, directed opti- 
mal linear arrangement problem, minimum bandwidth problem, folding labelling 
(also called minimum cut linear arrangement) problem, etc. We give more con- 
sideration to optimal linear arrangement problem and directed optimal linear 
arrangement problem since MCA is very closely related to them. 

Optimal Linear Arrangement (OLA): Given an undirected weighted graph
$G = (V, E, w)$ with non-negative weights, find a surjection $f : V \to \{0, 1, \ldots, n-1\}$
which minimizes $\sum_{e \in E} w(e)\,\ell(e)$, where $\ell(e) = |f(v) - f(u)|$ for $e = (u, v)$.

The OLA problem arose naturally from applications in VLSI design. Garey, Johnson
and Stockmeyer [14] proved NP-completeness of the decision version of OLA.
Today we know how to solve the OLA problem exactly for some special classes of
graphs, e.g., un-weighted trees [25,8], outer planar graphs [10], cycles, wheels,
complete bipartite graphs [16], etc. For arbitrary graphs, the best currently
known guarantee, an $O(\log n)$-approximation, is due to Rao and Richa [23]. Meanwhile,
there has also been some work on polynomial time approximation
schemes for un-weighted OLA on dense graphs, namely [4] and [11].
Directed Optimal Linear Arrangement (DOLA): Given a directed acyclic
weighted graph $G = (V, E, w)$ with non-negative weights, find a surjection
$f : V \to \{0, 1, \ldots, n-1\}$ such that $(u, v) \in E \Rightarrow f(u) < f(v)$, i.e. a topological
sort, which minimizes $\sum_{e \in E} w(e)\,\ell(e)$, where $\ell(e) = f(v) - f(u)$ for $e = (u, v)$.

Not much is known about DOLA. Its decision version was shown to be NP-
complete by Even and Shiloach [9]. On the algorithmic front, Adolphson and Hu
[2] gave an $O(n \log n)$-time algorithm to solve DOLA exactly on rooted trees,
where all the edges are oriented towards (or away from) the root. Otherwise,
neither approximation algorithms nor hardness of approximation results
are known for it.



1.2 Current Status 

The MCA problem is fairly recent [19], and very few theoretical results are known
about it. One of them is the proof of NP-completeness for the decision version
of the MCA problem restricted to sparse weighted graphs by Liberatore [18]. He
also gives an $O(\sqrt{n})$-approximation algorithm for arbitrary graph
instances using a divide-and-conquer strategy [18]. This result has recently been
improved by Naor and Schwartz [20], who present an $O(\log n \log\log n)$-approximation
algorithm for the MCA problem.

1.3 Our Results 

In this paper, we start out by proving some preliminary lemmas in section 3 
that bound MCA cost. In section 4, we draw comparison between MCA cost 
and OLA cost (DOLA cost), throwing light on the relative hardness of these 
problems. 

We prove that the decision version of even un-weighted MCA is NP-complete 
in case of sparse as well as dense graphs (section 5), a stronger result than [18]. 
We also consider the complementary version of MCA, called MaxCA, in section 6.
We prove that it is MAX-SNP[π] complete [21] and, therefore, has no PTAS
unless P=NP [5]. A similar proof technique then shows that MCA is
MAX-SNP[π]-hard and hence there is no PTAS for MCA either [5]. In section 7, we
prove a conditional lower bound of $\sqrt{2} - \varepsilon$ for MCA approximation under the
assumption that DOLA does not admit a constant factor approximation. Finally,
we conclude with a PTAS for MCA on dense instances in section 8.

2 Notation 

By a graph G, we mean a directed graph without parallel edges and loops. V 
and E, as always, stand for the vertex set and edge set of G respectively, and $|V| = n$.
An un-weighted graph is considered as a graph with edges of unit weight. When
we talk about the OLA problem on a directed graph we mean the OLA problem 
on the underlying undirected graph. 
Definition 1. A graph G is dense if $|E| = \Omega(n^2)$. More specifically, G is $\delta$-dense
if $|E| \ge \delta n^2$. Similarly, G is sparse if $|E| = O(n)$.

Consider a graph G and an arrangement $f : V \to \{1, \ldots, n\}$ of G.

Definition 2. An edge $e = (u, v)$ of G is said to be a forward edge with
respect to f if $f(u) < f(v)$. Similarly, e is a backward edge with respect
to f if $f(u) > f(v)$. Note that the forward/backward status of an edge can be
changed by rotating f.

Let MCOST(f), LCOST(f) and DCOST(f) respectively denote the circular,
linear and directed linear cost of the arrangement f as defined in the problem
definitions. For an edge e, $\mathrm{MCOST}_e(f)$ denotes the cost of the edge e in the
arrangement f. Similarly for $\mathrm{DCOST}_e(f)$ and $\mathrm{LCOST}_e(f)$. Set $\mathrm{DCOST}_e(f) =
\mathrm{DCOST}(f) = \infty$ if any edge $e = (u, v)$ is a backward edge.

Definition 3. Let g be a circular arrangement of G. By ROT(g) we mean an 
arrangement h obtained by rotating g so that the total weight of the backward 
edges is minimized. Note that MCOST(ROT(g)) = MCOST(g). 
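
A brute-force version of ROT simply tries all n rotations. The sketch below assumes the same dictionary representation as elsewhere (edges mapped to weights, vertices mapped to positions 0..n-1); the helper names are illustrative and ties are broken by taking the first best rotation.

```python
def mcost(edges, f):
    n = len(f)
    return sum(w * ((f[v] - f[u]) % n) for (u, v), w in edges.items())

def backward_weight(edges, f):
    """Total weight of edges (u, v) with f[u] > f[v]."""
    return sum(w for (u, v), w in edges.items() if f[u] > f[v])

def rot(edges, f):
    """Rotation of f with minimum total backward weight."""
    n = len(f)
    rotations = [{x: (p + s) % n for x, p in f.items()} for s in range(n)]
    return min(rotations, key=lambda h: backward_weight(edges, h))

path = {("a", "b"): 1, ("b", "c"): 1}
g = {"a": 1, "b": 2, "c": 0}          # edge (b, c) is backward here
h = rot(path, g)
```

Rotating changes which edges are backward but leaves every latency, and hence MCOST, unchanged.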

Let MCA(G) be the set of all optimal circular arrangements of G. Simi- 
larly define OLA(G) and DOLA(G). Sometimes, by abuse of notation, MCA(G) 
also stands for some optimal circular arrangement. Similarly for OLA(G) and 
DOLA(G). Finally, let MCOST(G) := MCOST(MCA(G)) denote the cost of the 
optimal arrangement. Similarly for LCOST(G) and DCOST(G). 

By $P_m$ we mean a directed path $p_1, \ldots, p_{m+1}$ of length m (on $m + 1$ vertices),
with unit weight edges. By $\vec{K}_n$ we mean the complete directed acyclic graph on
n vertices, i.e. for $1 \le i < j \le n$, there is an edge $(i, j)$ of unit weight.

By $\overline{G}$ we mean the graph anti-parallel to G, that is, $V(\overline{G}) = V(G)$ and
$E(\overline{G}) = \{(v, u) \mid (u, v) \in E(G)\}$. The edges in $\overline{G}$ carry the same weight as their
counterparts in G.

If G and H are two graphs, then $G + H$ is the graph which has G and H as
its two components.



3 Bounding MCA Cost 

In this section we show some upper and lower bounds on MCA cost and highlight 
its peculiar features that would help us derive our hardness results. 

Proposition 1. The total weight of the backward edges of ROT(g) $\le$

Proposition 2. Let $G = H_1 + \cdots + H_k$ be a graph with k components. Put
$n = |V(G)|$ and $n_i = |V(H_i)|$. Then
$\sum_{i=1}^{k} \mathrm{MCOST}(H_i) \le \mathrm{MCOST}(G) \le n \sum_{i=1}^{k} \mathrm{MCOST}(H_i)/n_i$.
Moreover these inequalities are tight.

Proof (sketch). For $1 \le i \le k$, let $h_i \in \mathrm{MCA}(H_i)$ and consider
$g = \mathrm{ROT}(h_1) \circ \cdots \circ \mathrm{ROT}(h_k)$. □
This behavior of MCA (with components) enables us to derive our hardness
results. The fundamental difference between MCA and OLA (or DOLA) is the
issue of connectedness. In the case of a graph with more than one component it is easy
to see that the optimal arrangement (in the case of OLA and DOLA) is obtained
by concatenating the optimal arrangements of the components. However, such is 
not the case with MCA. If there are any backward edges in the optimal circular 
arrangement of one of the components, then the latency of that edge is increased 
due to the presence of the other components. 

Definition 4. Let G be a weighted directed graph. For a vertex u, let
$w_1 \ge \cdots \ge w_d$ denote the weights of the outgoing edges from u. Define
$\lambda^+(u) = \sum_{i=1}^{d} i\,w_i$ and $\lambda^+(G) = \sum_{u \in V(G)} \lambda^+(u)$. Similarly define
$\lambda^-(u)$ and $\lambda^-(G)$ by replacing outgoing with incoming.

Proposition 3. $\mathrm{MCOST}(G) \ge \max\{\lambda^+(G), \lambda^-(G)\}$.
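
The bound can be checked numerically on a tiny instance. The sketch below assumes that Definition 4 reads as "sort the outgoing weights of each vertex in decreasing order and charge the i-th largest a latency of i", which is the natural reading since the outgoing edges of a vertex must have pairwise distinct latencies. All function names are illustrative, and the optimum is found by exhaustive search.

```python
from itertools import permutations

def lam_plus(edges):
    """Sum over vertices of sum_i i * w_i, outgoing weights sorted decreasingly."""
    by_tail = {}
    for (u, v), w in edges.items():
        by_tail.setdefault(u, []).append(w)
    return sum(i * w
               for ws in by_tail.values()
               for i, w in enumerate(sorted(ws, reverse=True), start=1))

def lam_minus(edges):
    """Same quantity for incoming edges, computed on the reversed graph."""
    return lam_plus({(v, u): w for (u, v), w in edges.items()})

def optimal_mcost(vertices, edges):
    """Exhaustive search; the first vertex is pinned since rotations are free."""
    n = len(vertices)
    best = None
    for perm in permutations(vertices[1:]):
        f = {vertices[0]: 0}
        f.update((x, i + 1) for i, x in enumerate(perm))
        cost = sum(w * ((f[v] - f[u]) % n) for (u, v), w in edges.items())
        best = cost if best is None else min(best, cost)
    return best

star = {("a", "b"): 3, ("a", "c"): 1, ("a", "d"): 2}
bound = max(lam_plus(star), lam_minus(star))
opt = optimal_mcost(["a", "b", "c", "d"], star)
```

On this out-star the lower bound is attained: the heaviest edge gets latency 1, the next latency 2, and so on.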

4 Comparison of MCA with OLA and DOLA 

4.1 Comparison with OLA 

Proposition 4. For any graph G, $\mathrm{LCOST}(G) \le 2(1 - 1/n) \cdot \mathrm{MCOST}(G)$.

Proof (sketch). Let $f \in \mathrm{MCA}(G)$ be an optimal circular arrangement. Denote
by $f_i$ the arrangement obtained by rotating f by i positions. Now consider any edge
$e = (u, v)$ of weight w with latency p with respect to f. The cost
of this edge in the linear arrangement $f_i$ is $pw$ if $f_i(u) < f_i(v)$ and $(n - p)w$ if
$f_i(u) > f_i(v)$. Averaging over all n rotations settles the claim. □
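
The averaging step can be spelled out as follows. This is a sketch of one way to complete the computation, using only the quantities named in the proof; the observation that an edge of latency p is backward in exactly p of the n rotations is supplied here and is not stated in the text.

```latex
% Edge e = (u,v) of weight w with circular latency p, 1 <= p <= n-1.
% Among the rotations f_0, ..., f_{n-1}, e is backward in exactly p of them
% (those whose wrap-around point falls between u and v) and forward in n-p.
\[
\frac{1}{n}\sum_{i=0}^{n-1} \mathrm{LCOST}_e(f_i)
  = \frac{(n-p)\,pw + p\,(n-p)\,w}{n}
  = \frac{2p(n-p)}{n}\,w
  \le 2\Bigl(1-\frac{1}{n}\Bigr)\,pw .
\]
% Summing over all edges, and using that the best rotation is at most the average:
\[
\mathrm{LCOST}(G) \le \min_i \mathrm{LCOST}(f_i)
  \le \frac{1}{n}\sum_{i=0}^{n-1}\mathrm{LCOST}(f_i)
  \le 2\Bigl(1-\frac{1}{n}\Bigr)\mathrm{MCOST}(f).
\]
```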

To see that the above result is tight, consider $G = C_n$, a directed cycle.
On the other hand, $\mathrm{MCOST}(G) \le (n - 1)\,\mathrm{LCOST}(G)$ trivially. By considering
appropriately directed sunflowers, one can show examples [12] which achieve
$\mathrm{MCOST}(G) \ge (n/12)\,\mathrm{LCOST}(G)$.

4.2 Comparison with DOLA 

From the definition, any legal DOLA arrangement is a legal MCA arrangement.
Hence we trivially have $\mathrm{MCOST}(G) \le \mathrm{DCOST}(G)$. On the other hand,
$\mathrm{DCOST}(G) \le (n - 1)\,\mathrm{MCOST}(G)$ trivially. In the case of weighted graphs this
is optimal, as shown in [19]. We can get a slightly more sophisticated bound if we
restrict ourselves to un-weighted graphs.

Proposition 5. Let G be an un-weighted directed acyclic graph and f be any 
DOLA arrangement ofG. T/ien DCOST(/) < \E\n—‘^^\E\^J\E\+'^\E\. More- 
over for interesting E (i.e. \E\ > 2S), DCOST(/) < \E\n— \E\ y^\E\/2. 

Proof (sketch). Note that in any legal DOLA arrangement of G, there can be at 
most n — i edges with latency i for each i. □ 
Corollary 1. For any un-weighted directed acyclic graph G,

DCOST(G) < MCOST(G) • . 

Thus in order to get an $\Omega(n)$ separation between DCOST(G) and MCOST(G),
we only need to look at sparse graphs in the un-weighted case. Moreover, any
approximation algorithm for DOLA on dense graphs automatically yields an
approximation algorithm for MCA on dense graphs. However, an approximation
algorithm for MCA does not a priori give rise to a DOLA approximation
algorithm, since an MCA arrangement need not be a legal DOLA arrangement.

5 NP Completeness 

Theorem 1 (Proposition 3.1 in [18]). The decision version of the MCA 
problem is NP-complete. 

Liberatore [18] proves that the weighted MCA problem is NP-complete by a reduction
from un-weighted DOLA. Since the MCA instance in his proof has
$|E| = O(|V|)$, we infer that MCA is NP-complete even when restricted to sparse
graphs. In this section we prove that even un-weighted MCA is NP-complete for
sparse as well as dense graphs, a stronger result than that of Liberatore [18]. We
too make use of a reduction from un-weighted DOLA.

5.1 Straightening Algorithm 

We start with an algorithm which allows us to normalize optimal solutions in a 
special case. 

Theorem 2 (Straightening Algorithm). Let G be a weighted directed graph,
and m > 2. Let f be any circular arrangement of $G + P_m$. We can transform f
(in time polynomial in $m + n$) to an arrangement g in which all the vertices of
$P_m$ appear in the order $p_1, \ldots, p_{m+1}$. Moreover $\mathrm{MCOST}(g) \le \mathrm{MCOST}(f)$.

Proof (sketch). For this proof it is more convenient to think of an arrangement
as a mapping from [n] to V or as an ordered list of vertices, rather than the
other way around. Let f be any circular arrangement of $G + P_m$. We define a
sequence of arrangements $g_1, \ldots, g_{m+1}$ with the following properties:

- $g_i(j) = p_j$ for all $1 \le j \le i \le m + 1$
- $\mathrm{MCOST}(g_{i+1}) \le \mathrm{MCOST}(g_i)$ for $1 \le i \le m$

Thus $g = g_{m+1}$ is the required arrangement. To start, let $g_1 = f$, suitably
rotated so that $g_1(1) = p_1$. Note that $\mathrm{MCOST}(g_1) = \mathrm{MCOST}(f)$. Assume we
know $g_i$ and $i \le m$ (else we are done). If $g_i(i + 1) = p_{i+1}$, then set $g_{i+1} = g_i$ and
continue with the next i.
Suppose $g_i(i + 1) \ne p_{i+1}$. Let $i + \ell$ denote the position of the vertex $p_{i+1}$
($2 \le \ell \le m + n - i$). Partition the vertices as follows: $L = \{p_1, \ldots, p_i\}$,
$M = \{g_i(i + 1), \ldots, g_i(i + \ell - 1)\}$, $R = \{g_i(i + \ell + 1), \ldots, g_i(m + n + 1)\}$. Thus the
arrangement $g_i$ is $L\,M\,p_{i+1}\,R$.

Let $W_{MR}$ be the total weight of all the edges going from M to R and $W_{RM}$
be the total weight of all edges going from R to M. Define

$$g_{i+1} = \begin{cases} L\,p_{i+1}\,M\,R & \text{if } W_{MR} \ge W_{RM} \\ L\,p_{i+1}\,R\,M & \text{if } W_{MR} < W_{RM} \end{cases}$$

The verification that $\mathrm{MCOST}(g_{i+1}) \le \mathrm{MCOST}(g_i)$ is left to the reader. □

Note that $g = p_1 p_2 \cdots p_{m+1} \circ \mathrm{ROT}(f|_G)$, where $f|_G$ is the arrangement f
restricted to the vertices of G. Hence this transformation can be implemented in time
$O(m + n^2)$. A generalization of this is proved in [18].

Proposition 6 (Lemma 3.4 of [18]). Let $G = H_1 + \cdots + H_k$ be a directed graph
with k components. Then there is an optimal circular arrangement of G which
can be obtained by concatenating (not necessarily optimal) circular arrangements
of the $H_i$.

We now have a corollary of Theorem 2 which gives us a technique to force 
an optimal MCA arrangement to have only forward edges. 

Corollary 2. Let G be an un-weighted directed acyclic graph and $m > \mathrm{DCOST}(G)$.
Let g be the circular arrangement obtained by concatenating $P_m$ with the optimal
DOLA arrangement of G. Then $\mathrm{MCOST}(G + P_m) = \mathrm{DCOST}(G) + m = \mathrm{MCOST}(g)$,
i.e. g is an optimal circular arrangement.

Proof (sketch). First note that $\mathrm{MCOST}(g) = \mathrm{DCOST}(G) + m < 2m$. Moreover,
G cannot have any backward edges in the straightened optimal circular arrangement
of $G + P_m$, for that would imply that the cost of the arrangement is $> 2m$. □

We conclude this section with a couple of NP Completeness proofs of un- 
weighted MCA. 

Theorem 3. The decision version of the un-weighted MCA problem is NP- 
complete. 

Proof. Proof by reduction from un-weighted DOLA. Let (G, K) be a DOLA
instance. Let m = be an upper bound for the cost of an optimal DOLA arrangement.
By Corollary 2, $\mathrm{MCOST}(G + P_m) = \mathrm{DCOST}(G) + m$. So if $G' = G + P_m$ and
$K' = K + m$, we have $\mathrm{DCOST}(G) \le K \iff \mathrm{MCOST}(G + P_m) \le K'$. □
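
The reduction can be sanity-checked by exhaustive search on a tiny instance. In the sketch below the value of m is passed in by the caller (any m larger than DCOST(G) will do for Corollary 2); the function names and the tuple encoding of the path vertices are illustrative assumptions.

```python
from itertools import permutations

def add_path(vertices, edges, m):
    """Build G + P_m for an un-weighted graph given as vertex and edge lists."""
    path = [("p", i) for i in range(1, m + 2)]
    path_edges = [(path[i], path[i + 1]) for i in range(m)]
    return vertices + path, edges + path_edges

def dcost_opt(vertices, edges):
    """Optimal DOLA cost: cheapest topological order, by brute force."""
    best = None
    for perm in permutations(vertices):
        f = {x: i for i, x in enumerate(perm)}
        if all(f[u] < f[v] for u, v in edges):
            cost = sum(f[v] - f[u] for u, v in edges)
            best = cost if best is None else min(best, cost)
    return best

def mcost_opt(vertices, edges):
    """Optimal MCA cost by brute force; one vertex is pinned at position 0."""
    n = len(vertices)
    best = None
    for perm in permutations(vertices[1:]):
        f = {vertices[0]: 0}
        f.update((x, i + 1) for i, x in enumerate(perm))
        cost = sum((f[v] - f[u]) % n for u, v in edges)
        best = cost if best is None else min(best, cost)
    return best

V, E = ["a", "b", "c"], [("a", "b"), ("a", "c")]
m = 4                                   # any m > DCOST(G) = 3
V2, E2 = add_path(V, E, m)
d, c = dcost_opt(V, E), mcost_opt(V2, E2)
```

Here DCOST(G) = 3 and MCOST(G + P_4) = 7, so a DOLA bound K translates into the MCA bound K + m as in the proof.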

Since the MCA instance in this proof has $|E| = O(|V|)$, we infer that un-weighted
MCA is NP-complete even when restricted to sparse graphs. We now
prove a generalization of Corollary 2 and use it to show that un-weighted MCA
is NP-complete even when restricted to dense instances.
Proposition 7. Let G and H be un-weighted directed acyclic graphs such that
$|V(H)| = m > \mathrm{DCOST}(G)$. Assume further that there is an optimal MCA arrangement
h of H which does not contain any backward edges. Let g be the circular
arrangement obtained by concatenating h with the optimal DOLA arrangement
of G. Then $\mathrm{MCOST}(G + H) = \mathrm{DCOST}(G) + \mathrm{DCOST}(H) = \mathrm{MCOST}(g)$,
i.e. g is an optimal circular arrangement.

Proof (sketch). First apply Proposition 6 to separate out vertices of G and H 
in the optimal arrangement. Then rearrange the H portion to be h, since h has 
no backward edges. Then proceed as in Corollary 2 to show that G cannot have 
any backward edges. □ 

Theorem 4. Decision version of the un-weighted MCA is NP-complete even 
when restricted to dense instances. 

Proof (sketch). Proof by reduction from un-weighted DOLA. Let (G, K) be a
DOLA instance. Let m = be an upper bound for the cost of an optimal DOLA
arrangement. Then consider the MCA instance $G' = G + \vec{K}_m$, where $\vec{K}_m$ is the
complete DAG on m vertices, and use Proposition 7. □

6 MAX-SNP[π] and PTAS

Papadimitriou and Yannakakis [21] show that the complementary version of
OLA, called Maximum Linear Arrangement, is in MAX-SNP[π]. We show that
the same is true for the following complementary version of MCA as well.

MaxCA Given a directed graph G, find an arrangement / that maximizes 
MCOST(/). 

Theorem 5. MaxCA is in MAX-SNP[π].

Proof (sketch). To show that MaxCA is in MAX-SNP[π], consider the first-order
quantifier-free predicate $\varphi(\pi, u, w, v, G) := B(u, w, v) \wedge ((u, v) \in E(G))$, where
$B(u, w, v)$ is true when $w = v$, or w occurs in between u and v in the $\pi$ order
(considered cyclically). □

6.1 MAX-SNP[π] Completeness of MaxCA

We prove that MaxCA is complete for MAX-SNP[π] by showing an L-reduction
[21] from the following MAX-SNP[π] complete problem. This problem is, in fact,
the complementary version of the minimum feedback arc set problem [13] and it does
not admit a PTAS unless P=NP, as shown in [5].

MAX SUBDAG: Given a directed graph G = (V,E), find a subset E' C E of 
maximum cardinality for which (V, E') is acyclic. 

In [12], we prove the following theorem. 

Theorem 6. MaxCA is MAX-SNP[π] complete.
6.2 MCA Is MAX-SNP[π] Hard

We now show that MCA cannot have a PTAS by showing a similar reduction
from MAX SUBDAG. In fact, this reduction can easily be modified into an
L-reduction, which then proves that MCA is a MAX-SNP[π]-hard problem. But we
prefer to put the proof in algorithmic form, since our main goal is to prove that
MCA has no PTAS unless P=NP.

Theorem 7. MCA does not have a PTAS unless P=NP. 



Proof (sketch). Suppose $\mathcal{M}$ is a $(1 + \varepsilon)$-approximation algorithm for MCA for
some $0 < \varepsilon < 1/3$.



Require: A directed graph G.
Ensure: $F \subseteq E$ for which (V, F) is acyclic.

$f \leftarrow SA(f)$. (SA is the straightening algorithm)
$F \leftarrow$ edges of G which are forward edges in the arrangement f.
Output F.


It can now be shown that $|F| \ge (1 - 3\varepsilon)|F^*|$, where $F^* \subseteq E$ denotes the
optimal solution to the MAX SUBDAG problem. This gives us a $(1 - 3\varepsilon)$-approximation
for MAX SUBDAG. Thus a PTAS for MCA gives a PTAS for
MAX SUBDAG. In the light of [5], this implies P=NP. □
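
The last two steps of the procedure rely on a simple fact: the forward edges of any arrangement form an acyclic subgraph, because the arrangement itself is a topological order for them. A minimal sketch, with an illustrative function name and a list-of-pairs edge representation:

```python
def forward_edges(edges, f):
    """Edges (u, v) with f[u] < f[v], for an arrangement f: vertex -> position."""
    return [(u, v) for u, v in edges if f[u] < f[v]]

triangle = [("a", "b"), ("b", "c"), ("c", "a")]
f = {"a": 0, "b": 1, "c": 2}
F = forward_edges(triangle, f)
# f is a topological order of (V, F), which certifies that (V, F) is acyclic.
is_topological = all(f[u] < f[v] for u, v in F)
```

On a directed triangle this keeps two of the three edges, which is the best any acyclic subgraph can do.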



7 Hardness of Approximation 

We now turn to hardness of approximation. We use the straightening algorithm 
to prove a curious hardness result for MCA. 

Proposition 8. Suppose that un-weighted DOLA has a polynomial time $\alpha$-approximation
algorithm and un-weighted MCA has a polynomial time $(1 + \delta)$-approximation
algorithm ($\delta < 1$). Then un-weighted DOLA has a polynomial
time $\mu(\alpha)$-approximation algorithm, where

t{v) = (1 + '^)+ P- 

Proof (sketch). Let $\mathcal{D}$ denote the $\alpha$-approximation algorithm for un-weighted
DOLA, and $\mathcal{M}$ denote the $(1 + \delta)$-approximation algorithm for un-weighted
MCA. Let SA denote the straightening algorithm of Theorem 2. Put
$\theta = (1 + \delta)/(1 - \delta)$.

Require: Input un-weighted directed graph G.
Ensure: g is a $\mu(\alpha)$-approximate DOLA arrangement of G.
1: $f \leftarrow \mathcal{D}(G)$.
2: $m \leftarrow \theta \cdot \mathrm{DCOST}(f)$.
3: $g_1 \leftarrow \mathcal{M}(G + P_m)$.
4: $g_2 \leftarrow SA(g_1)$.
5: $g \leftarrow g_2$ restricted to G.
6: Output g.

It can be shown that $\mathrm{DCOST}(G) \le \mathrm{DCOST}(g) \le \mu(\alpha) \cdot \mathrm{DCOST}(G)$. □
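
One way to see where a bound of this kind can come from is sketched below. The derivation is a reconstruction under stated assumptions, not the computation given in [12]: it takes the path length to be $m = \theta \cdot \mathrm{DCOST}(f)$ with $\theta = (1+\delta)/(1-\delta)$, ignores rounding of m, and uses Corollary 2 together with the fact that a backward edge of G in a straightened arrangement forces a total cost above 2m. The resulting expressions for $\mu$ and its fixed point are consequences of these assumptions only.

```latex
% Assumption: m = \theta \cdot DCOST(f), \theta = (1+\delta)/(1-\delta), so m \ge \theta \cdot DCOST(G).
\[
\mathrm{MCOST}(g_2) \le \mathrm{MCOST}(g_1)
   \le (1+\delta)\bigl(\mathrm{DCOST}(G) + m\bigr) \le 2m
   \quad\text{since } m \ge \tfrac{1+\delta}{1-\delta}\,\mathrm{DCOST}(G).
\]
% Hence G has no backward edge in g_2, the path costs exactly m, and
\[
\mathrm{DCOST}(g) = \mathrm{MCOST}(g_2) - m
   \le (1+\delta)\,\mathrm{DCOST}(G) + \delta m
   \le \bigl(1 + \delta + \alpha\,\delta\,\theta\bigr)\,\mathrm{DCOST}(G).
\]
% Iterating \alpha \mapsto 1 + \delta + \alpha\delta\theta converges iff \delta\theta < 1,
% i.e. 1 - 2\delta - \delta^2 > 0, i.e. \delta < \sqrt{2} - 1, with fixed point
\[
\frac{1+\delta}{1-\delta\theta}
   = \frac{1-\delta^2}{1-2\delta-\delta^2}
   = 1 + \frac{2\delta}{1-2\delta-\delta^2}.
\]
```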

One can view the above algorithm as a way of generating a $\mu(\alpha)$-approximate
arrangement given the cost (we don't need the arrangement) of an $\alpha$-approximate
arrangement. This leads to the following theorem. See [12] for the complete proof.

Theorem 8 (Bootstrapping). Suppose that un-weighted MCA has a poly- 
nomial time (1 + S)- approximation algorithm for some 5 < \/2 — 1. Put 
r{S) = 1+ i_ 2 S-s 2 ■ Then for every e > 0, there is a polynomial time (P(<5) + e)- 
approximation algorithm for un-weighted DOLA. 

As a corollary we have the following conditional hardness result. 

Corollary 3. For all $\eta \in (0, \sqrt{2} - 1)$, there is a constant $c_\eta$ such that it is
NP-hard to approximate un-weighted MCA to within $\sqrt{2} - \eta$ if it is NP-hard to
approximate un-weighted DOLA within $c_\eta$.

8 Polynomial Time Approximation Scheme 

We conclude with a PTAS for un-weighted MCA on dense graphs. Arora, Frieze 
and Kaplan [4] give a PTAS for OLA on dense graphs. We show how the same 
algorithm with minor modifications works for MCA as well. The algorithm gives 
an arrangement which is at most away from the optimal solution. If the 
graph is dense, then Proposition 3 shows that the optimum value is $\Omega(n^2)$.

Definition 5. For constant t, let h, , It he a partition of [n] := {1 , . . . ,n} 
into consecutive equal sized intervals, such that It = {it , . . . , (t -I- l)t — 1}. A 
placement is a mapping from the vertex set to the set {Ii, . . . , It}. A placement 
f is proper if |/^“^(/i)| = |/i| for each i. Given any mapping f \ V ^ [n], 
we denote by f the induced placement. The cost of a placement f', denoted by 
CP(/') is defined to be E(u,«)gb(/'(^) “ /'(^) t). 

Proposition 9. If f is any arrangement, |MCOST(/) — CP(/')n/t| < /t. 

Proof (sketch). Consider any edge which crosses an interval. If it has latency x 
w.r.t the arrangement /, then it has latency \ xt/n\ or {xt/n-\-l\ in the placement 
/'. This observation together with a generous upper bound on the number edges 
within an interval gives the result. □ 

Proposition 10. If f and g are arrangements such that \ CP(/') — CP((/')| < 
en^, then \ MCOST(/) - MCOST( 5 )| <{2 + e)n^/t. 

Now proceed just like in [4]. See proof details in [12]. 
9 Discussion

We studied the MCA problem in this paper. Its motivation came from a problem
related to the design of cyclic multicast schedules. Considering current trends in
technologies and applications, cyclic multicast that pays heed to data dependencies
should play a pivotal role in the future [6].

Our research pointed out certain negative aspects of the MCA problem,
namely, it does not have a polynomial time algorithm and does not even
admit a polynomial time approximation scheme for arbitrary graph instances (unless
P=NP). Yet it is possible that the MCA problem is tractable if restricted to
certain special kinds of graphs that have practical significance. The literature has
many such instances of polynomial time algorithms for the OLA problem, e.g.,
un-weighted trees [25,8], outer planar graphs [10], wheels, complete bipartite graphs
[16], etc. Can one hope for the same in the case of MCA, or is it too hard there as well?

Assuming DOLA to be non-approximable within any constant factor, we
could show a lower bound of $\sqrt{2} - \varepsilon$ for MCA approximation. We believe it
to be far from tight. In fact, there is a conspicuous lack of hardness of
approximation results even for OLA and DOLA. They stand as natural open 
problems. 

Liberatore provides a few heuristics [19,18] and an $O(\sqrt{n})$-approximation
algorithm [18] for the MCA problem on arbitrary graphs. Similarly, Naor and
Schwartz present an $O(\log n \log\log n)$-approximation algorithm in [20]. But these
algorithms suffer either from the lack of a performance guarantee or from inherent
inefficiency. Therefore it is an interesting open question to design an efficient
approximation algorithm for the MCA problem.
Abstract. We study integral 2-commodity flows in networks with a spe-
cial characteristic, namely symmetry. We show that the Symmetric 2- 
Commodity Flow Problem is in P, by proving that the cut criterion is a 
necessary and sufficient condition for the existence of a solution. We also 
give a polynomial-time algorithm whose complexity is $6\,C_{flow} + O(|A|)$,
where $C_{flow}$ is the time complexity of your favorite flow algorithm (usually
in $O(|V| \times |A|)$). Our result closes an open question in a surprising
way, since it is known that the Integral 2-Commodity Flow Problem is 
NP-complete for both directed and undirected graphs. This work finds 
application in optical telecommunication networks. 



1 Introduction 

Given a graph $G = (V, A)$, a capacity $\kappa : A \to \mathbb{N}$ and a request set
$R \subseteq (V \times V \times \mathbb{N})$, the Multi-Commodity Flow Problem (MCF) consists in finding
$|R|$ flows corresponding to the request set and with respect to the capacity
constraints on the graph. MCF has been widely studied, as it arises naturally
from many classical problems such as routing problems. The Fractional Problem 
(allowing fractional flows) can be solved in polynomial time by using linear 
programming ([9]). However, the Integral Problem (not allowing fractional flows) 
is also of interest when we have non-splittable units of traffic, or non-splittable 
routes to find (e.g. for synchronous communications). 

The integral MCF is NP-complete in the general case ([4] and [6]), and many
variants have been studied, depending on the properties of G, R and $G + R$
($G + R$ is the multigraph obtained from G by adding an arc for each request in
R). They divide into NP-complete and tractable problems. One
variant is the Disjoint Paths Problem, which consists in finding $|R|$ disjoint paths
corresponding to the request set (with $R \subseteq V \times V$). The other main variants
depend on the structure of G or R or both: whether G is directed or undirected,
whether G or $G + R$ is planar, whether G or $G + R$ is Eulerian, etc. For a
planar and Eulerian graph G with demands on the boundary, the problem has
been proven polynomial ([7]) and a linear time algorithm has been found ([12]).

Since G is also Eulerian, the integral symmetric MCF would appear to be
a particular case of the directed Eulerian variant, except that usually $G + R$
is assumed to be Eulerian (and not only G). On this variant, C. Nash-Williams
proved in 1965 (one can find a proof in [10]) that with $|R| = 2$ the problem was
polynomial, whereas with $|R| \ge 3$ the problem was NP-complete ([11]). In the
general directed case, the Disjoint Paths Problem is NP-complete with $|R| = 2$
([4]).

MCFs in Symmetric Digraphs are motivated by routing in optical networks,
since optical networks are best represented by symmetric digraphs. Hence, the
interconnection graph $G = (V, A)$ is symmetric directed (i.e. $(x, y) \in A \Leftrightarrow (y, x) \in A$),
and the capacity function is symmetric (i.e. $\forall (x, y) \in A$, $\kappa(x, y) = \kappa(y, x)$).
Such restrictions also apply to many other telecommunication networks.

In this paper, we will more specifically study the problem with symmetric
requests (so $G + R$ is indeed Eulerian, but $|R| = 4$). One can observe that a
symmetric digraph with symmetric requests has the same knowledge structure
as an undirected graph with undirected requests, and that an undirected solution
would perfectly fit the symmetric problem. While this is true, the symmetric
problem also allows more solutions and is more tractable: [2] proved that the
Integral Undirected 2-Commodity (i.e. $|R| = 2$) Flow Problem was NP-complete
even with a value of 1 for the first commodity. In this paper we will prove that
the Integral Symmetric (i.e. with a symmetric capacity and symmetric requests)
2-Commodity Flow Problem is polynomial.

In the course of our algorithm, we use standard cut considerations to build 
or ensure the existence of simple flows. We also exploit the symmetry of the 
problem to swap opposite flows without breaking the constraints. We give below 
a diagram of the complexity of flow and paths problems. We denote the number 
of vertices and arcs by N and M, respectively. 





Problem \ Graph           | Directed         | Undirected       | Directed Symm.
--------------------------+------------------+------------------+------------------
Disjoint paths            |                  |                  |
  1 request [Dijkstra]    | O(M + N log N)   | O(M + N log N)   | O(M + N log N)
  2+ req.                 | NP-hard [4]      | O(N^…) [8]       | O(N^…) [5]
  k req.                  | NP-hard [6]      | NP-hard [6]      | NP-hard [1]
Integral MCF              |                  |                  |
  1 req. [3]              | C_flow           | C_flow           |
  2 req.                  |                  | NP-hard [2]      | ??
  3+ req.                 | NP-hard          | NP-hard [2]      |
  1 pair of symm. req.    | NP-hard          | -                | C_flow [1]
  2 pairs of symm. req.   | NP-hard          | -                | 6C_flow + O(M)
  3+ pairs of symm. req.  | NP-hard          | -                | ??



In section 2 we introduce our notations and our problem. In section 3 we 
consider related works. In section 4 we give our algorithm and prove it, thus 
proving our main theorem: 

Theorem 1 (symmetric 2-commodity flow). The symmetric cut criterion
is a necessary and sufficient condition for the existence of a solution to the
Integral Symmetric 2-Commodity Flow Problem. A solution can be found in
$6C_{flow} + O(M)$ steps, where $C_{flow}$ is the time complexity of a chosen flow
algorithm.

2 Standard Notations

In the following, we present some notations that will be used throughout this 
paper. 

Note that the standard definition of a flow allows for the existence of loops
and we shall specify “flow without loops” when required. A $k$-commodity flow
$(f_1, \dots, f_k)$ is a collection of flows sharing the capacity $\kappa$ of the support graph.

One of the main notions related to flows and MCFs is the cut criterion. Such a
criterion expresses global constraints on the quantities of flow, with respect
to the capacities of the support graph.

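Definition 1 (cut criterion). Let $\kappa : A \to \mathbb{N}$ be a capacity on $G = (V, A)$.
Let $s_1, \dots, s_k, t_1, \dots, t_k \in V$. Let $v_1, \dots, v_k \in \mathbb{N}$. The cut criterion for
$(\kappa, (s_1, t_1, v_1), \dots, (s_k, t_k, v_k))$ is: for all cuts $C \subseteq V$, for all $I \subseteq \{1, \dots, k\}$ we have:
($\forall i \in I$, $s_i \in C$ and $t_i \notin C$) $\Rightarrow$ $\kappa(C) \ge \sum_{i \in I} v_i$.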

In the following, we define some symmetric notions related to the properties 
of symmetric digraphs. 

Definition 2 (symmetric function). Let $G = (V, A)$ be a symmetric digraph,
and let $f$ be a function from $A$ to $\mathbb{N}$. We say that $f$ is symmetric if for all
$(x, y) \in A$, $f(x, y) = f(y, x)$.

Note that, in the MCF on symmetric digraphs, the capacity is assumed to be 
symmetric. Now we introduce reverse functions. 

Definition 3 ($f^r$, reverse). Let $f : A \to \mathbb{N}$. The reverse function of $f$,
$f^r : A \to \mathbb{N}$, is the function defined by $\forall (x, y) \in A$, $f^r(x, y) = f(y, x)$.

To finish with the use of symmetric digraphs, we add a last operation on
functions:

— simplification: given a function $f$ from $A$ to $\mathbb{N}$, $|f| : A \to \mathbb{N}$ is the function
defined by $\forall (x, y) \in A$, if $f(x, y) > f(y, x)$, then $|f|(x, y) = f(x, y) - f(y, x)$,
otherwise $|f|(x, y) = 0$.
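As a small illustration of the reverse and simplification operations, the sketch below stores a function on arcs as a Python dictionary keyed by arc. The names `reverse` and `simplify` and the dictionary representation are assumptions made for this example only; they are not notation from the paper.

```python
def reverse(f):
    """Reverse function: the value on (y, x) becomes the value on (x, y)."""
    return {(y, x): val for (x, y), val in f.items()}


def simplify(f):
    """Simplification: keep only the excess of f(x, y) over f(y, x), else 0."""
    return {(x, y): max(val - f.get((y, x), 0), 0) for (x, y), val in f.items()}


# A function on the symmetric arc set {(a, b), (b, a), (b, c), (c, b)}.
f = {("a", "b"): 3, ("b", "a"): 1, ("b", "c"): 0, ("c", "b"): 2}
f_rev = reverse(f)
f_simpl = simplify(f)
```

After simplification, at most one of the two opposite arcs of a pair carries a positive value, which is the property the later proofs rely on.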

Symmetric 2-commodity flows. If an undirected 2-Commodity Flow represents
the solution of a communication instance between two pairs of nodes, then for
each pair of nodes the same set of paths is used for both directions. When we
relax the problem, i.e. when we allow the “returning” messages to use another set
of paths than the “incoming” messages, then we have a symmetric 2-commodity
flow.

Definition 4 (symmetric 2-commodity flow). Let $(f_1, f_{-1}, f_2, f_{-2})$ be a
4-commodity flow from $(s_1, t_1, s_2, t_2)$ to $(t_1, s_1, t_2, s_2)$ of value $(v_1, v_1, v_2, v_2)$.
$(f_1, f_{-1}, f_2, f_{-2})$ is also called a symmetric 2-commodity flow from $(s_1, s_2)$ to
$(t_1, t_2)$ of value $(v_1, v_2)$.

For the Symmetric 2-Commodity Flow Problem, we use a specific cut criterion,
called symmetric.

Definition 5 (symmetric cut criterion). Let $\kappa : A \to \mathbb{N}$ be a symmetric
capacity. Let $s_1, t_1, s_2, t_2 \in V$. Let $v_1$ and $v_2$ be two positive integers. The
symmetric cut criterion for $(\kappa, (s_1, t_1, v_1), (s_2, t_2, v_2))$ is the cut criterion for
$(\kappa, (s_1, t_1, v_1), (t_1, s_1, v_1), (s_2, t_2, v_2), (t_2, s_2, v_2))$.

3 Related Flow Formulations 

The main result we will use in this paper is a straightforward corollary of
Menger’s theorem.

Theorem 2 (Menger). The cut criterion is a necessary and sufficient condition
for the existence of a solution to the Integral Flow Problem.

Corollary 1. The cut criterion is a necessary condition for the existence of a
solution to the Integral k-Commodity Flow Problem.

This corollary implies for instance that the symmetric cut criterion is a nec- 
essary condition for the Symmetric 2-Commodity Flow Problem. 



3.1 Complexity of Some Integral Multi-commodity Flow Problems 

As we said in the Introduction, our problem is a particular case of the general
Integral MCF. Since the result of Fortune, Hopcroft and Wyllie [4] we know that
the Disjoint Paths Problem is NP-complete with only two requests ($|R| = 2$).

More specifically, we know that in Eulerian digraphs ($G + R$ is Eulerian),
the problem is polynomial with 2 commodities ($|R| = 2$). This is because if one
finds a flow for the first request, and removes the used capacities along with the
request in $G + R$, then the remaining request is part of some cycles in $G + R$,
so there is a solution for it. This property no longer holds if there are two requests
left. The problem is NP-hard with three ($|R| = 3$). In our problem, we have no
fewer than four requests ($|R| = 4$).

Concerning undirected graphs, it is noteworthy that the Disjoint Paths Problem
is polynomial with a bounded number of requests ($|R| \le k$) [8], like in
symmetric digraphs, but the 2-Commodity Flow Problem (with $|R| = 2$) is
NP-complete even if one of the requested flows should be of value 1 (with
$R = \{(x, y, 1), (x', y', v)\}$) [2], unlike in symmetric digraphs (see below).



3.2 The 2-Commodity Flow Problem in a Symmetric Digraph 

In symmetric digraphs, we already know that the Disjoint Paths Problem is
NP-complete [1] in general, but that it is polynomial with a bounded number
of requests [5]. For the MCF, we prove in the following theorem that
the 2-Commodity Flow Problem is polynomial with a value of $(1, v)$ (with
$R = \{(x, y, 1), (x', y', v)\}$).

Theorem 3. The 2-Commodity Flow Problem is polynomial in symmetric digraphs
if the value is $(1, v)$.

Proof. We assume that the cut criterion is true (it can be checked in polynomial
time), then find the flow of value $v$, and finally find the flow of value 1.

The input is:

— a symmetric capacity $\kappa : A \to \mathbb{N}$;

— the source and target of the requested 2-commodity flow $(s_1, s_2), (t_1, t_2) \in V^2$;

— the value of the requested 2-commodity flow $(1, v) \in \mathbb{N}^2$.

We first find a flow without loops $f_2 \le \kappa$ from $s_2$ to $t_2$ of value $v$. If the cut
criterion is true, then $f_2$ can be found using any polynomial time flow algorithm.
Now we prove that the cut criterion is true for $((\kappa - f_2), (s_1, t_1, 1))$. Let $C \subseteq V$
be a cut such that $s_1 \in C$ and $t_1 \notin C$. Two cases are possible:

— if $s_2 \in C$ and $t_2 \notin C$, then $\kappa(C) \ge (1 + v)$. If $f_2(C) \ge v + 1$, then there is
$x \in C$ and $y \notin C$ with $(x, y) \in A$ such that $f_2(y, x) \ge 1$ so $f_2(x, y) = 0$:
$(\kappa - f_2)(C) \ge 1$.

— otherwise, $\kappa(C) \ge 1$. If $f_2(C) \ge 1$, then there is $x \in C$ and $y \notin C$ with
$(x, y) \in A$ such that $f_2(y, x) \ge 1$ so $f_2(x, y) = 0$: $(\kappa - f_2)(C) \ge 1$.

The cut criterion is true, so according to Theorem 2, we can find a flow $f_1 \le
(\kappa - f_2)$ from $s_1$ to $t_1$ of value 1.

The 2-Commodity Flow Problem in a Symmetric Digraph is closely related
to our problem, though its formulation somehow breaks the symmetry of the
model. Here we give a few remarks on it.

The cut criterion is not a sufficient condition for the 2-Commodity Flow
Problem, though it is necessary according to Theorem 2. Here is an example
(see Figure 1) showing that the cut criterion is not sufficient:

— the vertex set is $V = \{s_1, s_2, t_1, t_2\}$;

— the arc set is $A = \{(s_1, s_2), (s_2, s_1), (s_2, t_1), (t_1, s_2), (s_1, t_2), (t_2, s_1),
(t_2, t_1), (t_1, t_2)\}$;

— we consider the symmetric digraph $G = (V, A)$;

— the capacity on $G$ is $\kappa : A \to \mathbb{N}$ such that

$\kappa(s_1, s_2) = \kappa(s_2, s_1) = 1$, $\kappa(s_2, t_1) = \kappa(t_1, s_2) = 1$,
$\kappa(s_1, t_2) = \kappa(t_2, s_1) = 3$, $\kappa(t_2, t_1) = \kappa(t_1, t_2) = 1$.

The cut criterion is true for $(\kappa, (s_1, t_1, 2), (s_2, t_2, 2))$, but there is no
2-commodity flow from $(s_1, s_2)$ to $(t_1, t_2)$ of value $(2, 2)$.

However, observe that the symmetric cut criterion is false when we study the
example given before: the cut $\{s_1, t_2\}$ is of value 2 though $s_1$ and $t_2$ are on the
same side of this cut.
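The claims about this example can be checked mechanically by enumerating all cuts. The sketch below is a brute-force check written for this four-vertex instance only; it assumes that the value of a cut is the total capacity of the arcs leaving it, and all function names are hypothetical.

```python
from itertools import combinations


def cut_value(kappa, cut):
    """Total capacity of the arcs leaving the vertex set `cut`."""
    return sum(c for (x, y), c in kappa.items() if x in cut and y not in cut)


def cut_criterion(vertices, kappa, requests):
    """Brute-force check over all cuts; requests are (source, target, value)."""
    for size in range(1, len(vertices)):
        for cut in map(set, combinations(vertices, size)):
            # Taking every separated request gives the strongest constraint.
            demand = sum(v for (s, t, v) in requests if s in cut and t not in cut)
            if cut_value(kappa, cut) < demand:
                return False
    return True


def symmetric_cut_criterion(vertices, kappa, requests):
    """Cut criterion with each request (s, t, v) doubled by (t, s, v)."""
    doubled = [r for (s, t, v) in requests for r in ((s, t, v), (t, s, v))]
    return cut_criterion(vertices, kappa, doubled)


def symmetric_capacity(pairs):
    """Build a symmetric capacity from {(x, y): c} given once per pair."""
    kappa = {}
    for (x, y), c in pairs.items():
        kappa[(x, y)] = c
        kappa[(y, x)] = c
    return kappa


V = ["s1", "s2", "t1", "t2"]
kappa = symmetric_capacity(
    {("s1", "s2"): 1, ("s2", "t1"): 1, ("s1", "t2"): 3, ("t2", "t1"): 1}
)
requests = [("s1", "t1", 2), ("s2", "t2", 2)]

plain_ok = cut_criterion(V, kappa, requests)            # cut criterion holds
symmetric_ok = symmetric_cut_criterion(V, kappa, requests)  # fails on {s1, t2}
```

The enumeration is exponential in the number of vertices, so it is only a sanity check for tiny examples, not a substitute for a flow computation.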

A direct consequence of our main theorem (Theorem 1) is that the symmetric
cut criterion is sufficient for the 2-Commodity Flow Problem. Observe that it is
not necessary, as shown with the following example (see Figure 2):

— the vertex set is $V = \{x, y\}$; the arc set is $A = \{(x, y), (y, x)\}$;

— we consider the symmetric digraph $G = (V, A)$;

— the capacity on $G$ is $\kappa : A \to \mathbb{N}$ with $\kappa(x, y) = \kappa(y, x) = 1$;

— we define a flow $f \le \kappa$ by $f(x, y) = 1$ and $f(y, x) = 0$.

The symmetric cut criterion is false for $(\kappa, (x, y, 1), (y, x, 1))$, but $(f, f^r)$ is a
2-commodity flow from $(x, y)$ to $(y, x)$ of value $(1, 1)$.
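To see why the symmetric cut criterion fails on this two-vertex example, it is enough to evaluate it on the single cut $C = \{x\}$, assuming that $\kappa(C)$ denotes the capacity of the arcs leaving $C$:

```latex
% The symmetric cut criterion for (\kappa, (x, y, 1), (y, x, 1)) is the cut
% criterion for (\kappa, (x, y, 1), (y, x, 1), (y, x, 1), (x, y, 1)).
% Two of these four requests go from x to y, so both are separated by C = \{x\}.
\[
  \kappa(\{x\}) \;=\; \kappa(x, y) \;=\; 1
  \;<\; 1 + 1 \;=\; 2 .
\]
```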



Fig. 2. There is a 2-commodity flow from $(x, y)$ to $(y, x)$ of value $(1, 1)$.



4 Solution to the Symmetric 2-Commodity Flow Problem 

In this section, we consider a symmetric digraph $G = (V, A)$ with a symmetric
capacity $\kappa : A \to \mathbb{N}$. We also consider two pairs of vertices $(s_1, t_1)$ and $(s_2, t_2)$,
and two positive integers $v_1$ and $v_2$. The goal of this section is to prove the
following theorem by construction.

Theorem 1 (symmetric 2-commodity flow). The symmetric cut criterion
is a necessary and sufficient condition for the existence of a solution to
the Integral Symmetric 2-Commodity Flow Problem. A solution can be found by
Algorithm 1 in $6C_{flow} + O(|A|)$ steps.

We give an algorithm (Algorithm 1) that solves the symmetric 2-commodity
flow problem. This algorithm is divided into three parts, which are explained in
the next three subsections.

In Section 4.1 we give the first part of our algorithm (Algorithm 3 (first step)),
which gives a flow for the first commodity that does not alter the cut criterion
for the second commodity. Note however that this intermediate flow is not part of
the final solution. Then, in Section 4.2 we compute the flows for the second
commodity (Algorithm 4 (second step)), reusing the partial solution of Algorithm 3.
These two flows are part of the final solution. Finally (Section 4.3), we compute
the flows for the first commodity (Algorithm 5 (third step)). This last part
is quite straightforward.

Algorithm 1 (integral symmetric 2-commodity flow)

Input: $G = (V, A)$; $\kappa : A \to \mathbb{N}$; $s_1, t_1, s_2, t_2 \in V$; $v_1, v_2 \in \mathbb{N}$.

— using Algorithm 3 (first step), compute a flow $f$ from $s_1$ to $t_1$ of value
$v_1$ such that the cut criterion is true for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$.

— using Algorithm 4 (second step) and $f$, compute two flows $f_2$ from $s_2$ to
$t_2$ of value $v_2$ and $f_{-2}$ from $t_2$ to $s_2$ of value $v_2$ such that $(f_2 + f_{-2}) \le \kappa$
and such that the cut criterion is true for $((\kappa - f_2 - f_{-2}), (s_1, t_1, v_1))$.

— using Algorithm 5 (third step), compute two flows $f_1$ from $s_1$ to $t_1$ of
value $v_1$ and $f_{-1}$ from $t_1$ to $s_1$ of value $v_1$ such that $(f_1 + f_{-1} + f_2 + f_{-2}) \le \kappa$.

Output: $f_1 : A \to \mathbb{N}$; $f_{-1} : A \to \mathbb{N}$; $f_2 : A \to \mathbb{N}$; $f_{-2} : A \to \mathbb{N}$.

In the course of the following section, we will need a subroutine (Algorithm 2)
which splits a flow $f$ of even value on a directed graph $G = (V, A)$ into two flows
$g$ and $g'$ (this means that $g + g' = f$) so that $2g \le (f + 1)$ and $2g' \le (f + 1)$.
In other words, it computes a flow $g$ such that $(f - 1) \le 2g \le (f + 1)$ (by 1, we
mean the function such that $\forall a \in A$, $1(a) = 1$).

Algorithm 2 (finding $g$ such that $(f - 1) \le 2g \le (f + 1)$)

Input: $G = (V, A)$; $f : A \to \mathbb{N}$.

Variables: $f' : A \to \mathbb{N}$; $x \in V$.

— $\forall a \in A$, $f'(a) \leftarrow f(a)$

— while there is $(x_{start}, y_{start}) \in A$ such that $f'(x_{start}, y_{start})$ is odd do

  * $f'(x_{start}, y_{start}) \leftarrow f'(x_{start}, y_{start}) + 1$

  * $x \leftarrow y_{start}$

  * while $x \ne x_{start}$ do

    * if there is $y \in \Gamma^+(x)$ such that $f'(x, y)$ is odd then

      ■ $f'(x, y) \leftarrow f'(x, y) + 1$

      ■ $x \leftarrow y$

    * else choose $y \in \Gamma^-(x)$ such that $f'(y, x)$ is odd

      ■ $f'(y, x) \leftarrow f'(y, x) - 1$

      ■ $x \leftarrow y$

— $\forall a \in A$, define $g(a) = f'(a) / 2$

Output: $g : A \to \mathbb{N}$.

Lemma 1 (half flow). Let $f$ be a flow from $s$ to $t$ of even value $2v$. There is
a flow $g$ from $s$ to $t$ of value $v$ such that $(f - 1) \le 2g \le (f + 1)$. This flow can
be computed by Algorithm 2 in $O(|A|)$ steps.
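The parity-fixing walk of Algorithm 2 can be sketched in a few lines of Python. This is an illustrative reading of the pseudocode, not the authors' implementation: the flow is a dictionary keyed by arc, the graph is assumed to have no self-loops, the input is assumed to be a flow of even value, and arcs are found by linear scans, so this version is quadratic rather than $O(|A|)$. The name `half_flow` is hypothetical.

```python
def half_flow(flow):
    """Return g with f - 1 <= 2g <= f + 1, for a flow f of even value.

    `flow` maps arcs (x, y) to non-negative integers.
    """
    fp = dict(flow)  # working copy f'

    def odd(arc):
        return fp[arc] % 2 == 1

    while True:
        start = next((a for a in fp if odd(a)), None)
        if start is None:
            break  # every arc is even: f' can be halved
        x_start, y_start = start
        fp[start] += 1
        x = y_start
        while x != x_start:
            # Prefer an odd outgoing arc and push one more unit along it.
            out_arc = next((a for a in fp if a[0] == x and odd(a)), None)
            if out_arc is not None:
                fp[out_arc] += 1
                x = out_arc[1]
            else:
                # Otherwise walk an odd incoming arc backwards, removing one unit.
                in_arc = next(a for a in fp if a[1] == x and odd(a))
                fp[in_arc] -= 1
                x = in_arc[0]
    return {a: val // 2 for a, val in fp.items()}


# A flow of value 2 from s to t made of two unit paths.
f = {("s", "a"): 1, ("s", "b"): 1, ("a", "t"): 1, ("b", "t"): 1}
g = half_flow(f)
```

Each arc changes parity at most once, which is why the result stays within one unit of the original flow on every arc.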

The proof of Lemma 1 is technical. It can be found in the full version of this
paper.

4.1 Finding a Preliminary Flow

In this subsection, we describe an algorithm that finds a flow $f$ such that the
cut criterion is true for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$.

Algorithm 3 (first step)

Input: $G = (V, A)$; $\kappa : A \to \mathbb{N}$; $s_1, t_1, s_2, t_2 \in V$; $v_1, v_2 \in \mathbb{N}$.

— compute a flow $h \le \kappa$ from $s_2$ to $t_2$ of value $v_2$

Let $h^r$ be the reverse flow from $t_2$ to $s_2$ of value $v_2$ (see Definition 3).

— compute a flow $g \le (\kappa + h^r - h)$ from $s_1$ to $t_1$ of value $v_1$

— compute a flow $g' \le (\kappa + h - h^r)$ from $s_1$ to $t_1$ of value $v_1$

— using Algorithm 2, compute a flow $f$ from $s_1$ to $t_1$ of value $v_1$ such that
$2f \le (|g + g'| + 1)$

Output: $f : A \to \mathbb{N}$.



Lemma 2 (first step). Let $\kappa : A \to \mathbb{N}$ be a symmetric capacity. Let
$s_1, t_1, s_2, t_2 \in V$ and $v_1, v_2 \in \mathbb{N}$ such that the symmetric cut criterion is true for
$(\kappa, (s_1, t_1, v_1), (s_2, t_2, v_2))$. Then there is a flow $f \le \kappa$ from $s_1$ to $t_1$ of value $v_1$
such that the cut criterion is true for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$. This flow
can be computed by Algorithm 3 in $3C_{flow} + O(|A|)$ steps.

We divide the proof into two parts: first the algorithm completes, and then the
output is correct.

Proposition 1. If the symmetric cut criterion is true for $(\kappa, (s_1, t_1, v_1),
(s_2, t_2, v_2))$, then Algorithm 3 completes in $3C_{flow} + O(|A|)$ steps.

Proof. Whenever the required flows exist, any integral flow algorithm
can find them.

— if the symmetric cut criterion is true, then the flow $h$ does exist.

— in Ford and Fulkerson’s algorithm, the capacity $(\kappa + h^r - h)$ is called the
residual capacity once $h$ has been computed. Since the cut criterion is true
between $\{s_1, s_2\}$ and $\{t_1, t_2\}$, one can increment the flow between $\{s_1, s_2\}$
and $\{t_1, t_2\}$ by $v_1$ and so find a flow $g$ from $s_1$ to $t_1$ of value $v_1$ with the
residual capacity. That is, $g \le (\kappa + h^r - h)$.

— in the same manner, $(\kappa + h - h^r)$ is the residual capacity once the flow
$h^r$ has been removed, so $g'$ can be found.

— according to Lemma 1, $f$ is found by Algorithm 2.

This algorithm computes 3 flows, calls Algorithm 2 once and computes two
capacity functions, so it takes $3C_{flow} + O(|A|)$ steps.

Proposition 2. If the symmetric cut criterion is true for $(\kappa, (s_1, t_1, v_1),
(s_2, t_2, v_2))$, then the flow $f$ given by Algorithm 3 is such that the cut criterion
for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$ is true.

Proof. First, observe that $(g + g') \le 2\kappa$, so we have $f \le \kappa$. Before considering
$(\kappa - f)$, we will give two lower bounds to the function $(\kappa - |g + g'|) : A \to \mathbb{Z}$.
Those bounds will enable us to control $(\kappa - f)$ over a cut.

We computed $g'$ in order to have $g' \le (\kappa + h - h^r)$; thus we have $(\kappa - g') \ge
(h^r - h)$ and this leads to $(\kappa - (g + g')) \ge (h^r - h - g)$. Now, consider $(x, y) \in A$
such that $g(y, x) > 0$:

— if $|g + g'|(x, y) = g'(x, y) - g(y, x)$, then $(\kappa - |g + g'|)(x, y) = \kappa(x, y) -
g'(x, y) + g(y, x)$ so $(\kappa - |g + g'|)(x, y) \ge (h^r - h + g^r)(x, y)$.

— otherwise, $|g + g'|(x, y) = 0$, so $(\kappa - |g + g'|)(x, y) = \kappa(x, y)$. Since $g \le
\kappa + h^r - h$, we have $\kappa(y, x) \ge g(y, x) + h(y, x) - h(x, y)$, so $(\kappa - |g + g'|)(x, y) \ge
(h^r - h + g^r)(x, y)$.

Since $g(x, y)$ and $g(y, x)$ cannot be non-null at the same time, we have $(\kappa - |g +
g'|) \ge (h^r - h - g + g^r)$. By a symmetric argument, we have as well $(\kappa - |g + g'|) \ge
(h - h^r - g' + g'^r)$.

With these bounds, we can now consider a cut $C \subseteq V$ with, for instance,
$s_2 \in C$ and $t_2 \notin C$. We will prove that $(\kappa - f)(C) \ge v_2$. We know that $(\kappa - |g +
g'|)(C) \ge (h - h^r - g' + g'^r)(C)$, so $(\kappa - |g + g'|)(C) \ge v_2 + (g'^r - g')(C)$.

— if $t_1 \in C$ or $s_1 \notin C$, then $g'(C) \le g'^r(C)$ and $(\kappa - |g + g'|)(C) \ge v_2$ so
$(\kappa - f)(C) \ge v_2$.

— otherwise, $s_1 \in C$ and $t_1 \notin C$ so $g'(C) = g'^r(C) + v_1$ and $(\kappa - |g + g'|)(C) \ge
v_2 - v_1$. In this case we have also $|g + g'|(C) = |g + g'|^r(C) + 2v_1$ and $f(C) =
f^r(C) + v_1$, so $f(C) \le |g + g'|^r(C) + v_1$ which implies $f(C) \le |g + g'|(C) - v_1$.
Thus $(\kappa - f)(C) \ge v_2$.

Therefore, the cut criterion is true for $((\kappa - f), (s_2, t_2, v_2))$. By a symmetric
argument, it is also true for $((\kappa - f), (t_2, s_2, v_2))$.

4.2 Finding Two of the Four Flows 

Although the cut criterion is true for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$, it may not
be sufficient to guarantee the existence of a flow $h$ from $s_2$ to $t_2$ of value $v_2$ and
a flow $h'$ from $t_2$ to $s_2$ of value $v_2$ such that $(f + h + h') \le \kappa$. That is why $f$ is
not part of the final solution.

However, $f$ is useful to compute the final flows $f_2$ and $f_{-2}$ for the second
commodity.

Algorithm 4 (second step)

Input: $G = (V, A)$, $\kappa : A \to \mathbb{N}$; $s_2, t_2 \in V$; $v_2 \in \mathbb{N}$; $f : A \to \mathbb{N}$.

Let $f^r$ be the reverse flow from $t_1$ to $s_1$ of value $v_1$.

— compute a flow $h \le (\kappa - f)$ from $s_2$ to $t_2$ of value $v_2$

— compute a flow $h' \le (\kappa - f^r)$ from $s_2$ to $t_2$ of value $v_2$

— using Algorithm 2, compute a flow $f_2$ from $s_2$ to $t_2$ of value $v_2$ such that
$(|h + h'| - 1) \le 2f_2 \le (|h + h'| + 1)$.

Let $f_{-2}^r = (|h + h'| - f_2)$. We call $f_{-2}$ the reverse flow from $t_2$ to $s_2$ of value $v_2$.

Output: $f_2 : A \to \mathbb{N}$; $f_{-2} : A \to \mathbb{N}$.

Lemma 3 (second step). Let $\kappa : A \to \mathbb{N}$ be a symmetric capacity. Let
$s_1, t_1, s_2, t_2 \in V$ and $v_1, v_2 \in \mathbb{N}$. Let $f \le \kappa$ be a flow from $s_1$ to $t_1$ of value
$v_1$ such that the cut criterion is true for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$. Then
there are two flows $f_2$ from $s_2$ to $t_2$ of value $v_2$ and $f_{-2}$ from $t_2$ to $s_2$ of
value $v_2$ such that $(f_2 + f_{-2}) \le \kappa$ and such that the cut criterion is true for
$((\kappa - f_2 - f_{-2}), (s_1, t_1, v_1))$. These two flows can be computed by Algorithm 4 in
$2C_{flow} + O(|A|)$ steps.

We divide the proof into three parts: first the algorithm completes, then $(f_2 +
f_{-2}) \le \kappa$, and finally the cut criterion is true for $((\kappa - f_2 - f_{-2}), (s_1, t_1, v_1))$.

Proposition 3. If the cut criterion is true for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$,
then Algorithm 4 completes in $2C_{flow} + O(|A|)$ steps.

Proof. If the cut criterion is true for $((\kappa - f), (s_2, t_2, v_2), (t_2, s_2, v_2))$, then the
flows $h$ and $h'$ do exist. $|h + h'|$ is a flow from $s_2$ to $t_2$ of value $2 \times v_2$, so according to
Lemma 1, the flow $f_2$ can be computed by Algorithm 2. This algorithm computes
2 flows, calls Algorithm 2 and computes 4 functions so it completes in $2C_{flow} +
O(|A|)$ steps.

Proposition 4. The flows $f_2$ and $f_{-2}$ given by Algorithm 4 are such that $(f_2 +
f_{-2}) \le \kappa$.

Proof. We know that $2f_2 \le (|h + h'| + 1)$ and $2f_{-2}^r \le (|h + h'| + 1)$. Since
$|h + h'| \le 2\kappa$, we have $f_2 \le \kappa$ and $f_{-2} \le \kappa$. Moreover, $(f_2 + f_{-2}^r) = |h + h'|$
implies that $\forall (x, y) \in A$, if $f_2(x, y) > 0$, then $f_{-2}^r(y, x) = 0$, so $f_{-2}(x, y) = 0$.
Thus $(f_2 + f_{-2}) \le \kappa$.
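The step from the bound on $2f_2$ to the bound on $f_2$ relies on integrality. Spelled out arc by arc, and using only the two inequalities stated in the proof, a short derivation reads:

```latex
\[
  2 f_2(a) \;\le\; |h + h'|(a) + 1 \;\le\; 2\kappa(a) + 1
  \quad\Longrightarrow\quad
  f_2(a) \;\le\; \kappa(a) + \tfrac{1}{2}
  \quad\Longrightarrow\quad
  f_2(a) \;\le\; \kappa(a)
  \qquad \text{for every } a \in A .
\]
% The last implication holds because f_2(a) and \kappa(a) are integers.
% The same argument bounds the reverse of f_{-2}; since \kappa is symmetric,
% the bound carries over to f_{-2} itself.
```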

Proposition 5. If $f \le \kappa$ is a flow from $s_1$ to $t_1$ of value $v_1$, the flows $f_2$ and $f_{-2}$
given by Algorithm 4 are such that the cut criterion for $((\kappa - f_2 - f_{-2}), (s_1, t_1, v_1))$
is true.

Proof. Consider a cut $C \subseteq V$ such that $s_1 \in C$ and $t_1 \notin C$. First we will prove
that $(|h + h'|(C) + |h + h'|(V \setminus C)) \le (2\kappa(C) - 2v_1)$. This implies $((f_2 + f_{-2})(C) +
(f_2 + f_{-2})(V \setminus C)) \le (2\kappa(C) - 2v_1)$. Then we will prove that $(f_2 + f_{-2})(C) \le
(\kappa(C) - v_1)$.

We split $|h + h'|$ into two functions: $g$ (the part related to $h$) and $g'$ (the
part related to $h'$) so $g = \min(h, |h + h'|)$ and $g' = \min(h', |h + h'|)$. We call
$r$ the symmetric function $r = (h + h') - |h + h'|$. Observe that if $g(x, y) > 0$
then $g'(y, x) = 0$ and if $g'(x, y) > 0$ then $g(y, x) = 0$. Moreover, for every $(x, y)$,
$g(x, y) + r(x, y) + f(x, y) \le \kappa(x, y)$ and $g'^r(x, y) + r(x, y) + f(x, y) \le \kappa(x, y)$, so
we are able to conclude that $(g + g'^r + r) \le (\kappa - f)$. Since $h$ and $h'^r$ are flows that
go in opposite directions, we have $(h(C) - h(V \setminus C)) = (h'^r(V \setminus C) - h'^r(C))$. This
implies $(h + h'^r)(C) = (h^r + h')(C)$, so $(g + g'^r + 2r)(C) \ge (g^r + g')(C)$. It follows
that $(g + g'^r + g^r + g')(C) \le 2(g + g'^r + r)(C)$, thus $|h + h'|(C) + |h + h'|(V \setminus C) \le
(2\kappa(C) - 2v_1)$.

Since $(f_2 + f_{-2}^r) = |h + h'|$, we have as well $(f_2 + f_{-2}^r)(C) + (f_2 + f_{-2}^r)(V \setminus C) \le
(2\kappa(C) - 2v_1)$, so $(f_2 + f_{-2})(C) + (f_2 + f_{-2})(V \setminus C) \le (2\kappa(C) - 2v_1)$.

Like $h$ and $h'$, the flows $f_2$ and $f_{-2}$ go in opposite directions, so $(f_2 +
f_{-2})(C) = (f_2 + f_{-2})(V \setminus C)$. Therefore $(f_2 + f_{-2})(C) \le (\kappa(C) - v_1)$.

4.3 Finding the Last Two Flows

Once $f_2$ and $f_{-2}$ have been properly computed, the algorithm to find $f_1$ and $f_{-1}$
is quite straightforward.

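Algorithm 5 (third step)

Input: $G = (V, A)$, $\kappa : A \to \mathbb{N}$; $s_1, t_1 \in V$; $v_1 \in \mathbb{N}$; $f_2 : A \to \mathbb{N}$; $f_{-2} : A \to \mathbb{N}$.

— compute a flow $f_1 \le (\kappa - f_2 - f_{-2})$ from $s_1$ to $t_1$ of value $v_1$

Let $f_{-1} = (\kappa - f_1 - f_2 - f_{-2})$

Output: $f_1 : A \to \mathbb{N}$; $f_{-1} : A \to \mathbb{N}$.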



Lemma 4 (third step). If $(f_2, f_{-2})$ is a two-commodity flow from $(s_2, t_2)$ to
$(t_2, s_2)$ of value $(v_2, v_2)$ such that $(f_2 + f_{-2}) \le \kappa$ and if the cut criterion is true
for $((\kappa - f_2 - f_{-2}), (s_1, t_1, v_1))$, then Algorithm 5 completes in $C_{flow} + O(|A|)$
steps, and its outputs are two flows $f_1$ from $s_1$ to $t_1$ of value $v_1$ and $f_{-1}$ from $t_1$
to $s_1$ of value $v_1$ such that $(f_1 + f_{-1} + f_2 + f_{-2}) \le \kappa$.

Proof. If the cut criterion is true for $((\kappa - f_2 - f_{-2}), (s_1, t_1, v_1))$ then the flow
$f_1$ does exist. So Algorithm 5 takes $C_{flow} + O(|A|)$ steps. Now observe that the
function $(\kappa - f_1 - f_2 - f_{-2}) : A \to \mathbb{N}$ is a flow from $t_1$ to $s_1$ of value $v_1$ (see
definition ??), though it may have some loops (which could be easily removed
in $C_{flow}$ more steps).
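One way to see why the leftover capacity is itself a flow from $t_1$ to $s_1$ is to compare, at each vertex, what goes out with what comes in. The notation $\partial g(x)$ below (outgoing minus incoming value of $g$ at $x$) is introduced only for this sketch and is not the paper's; the computation assumes $s_1 \ne t_1$.

```latex
\[
  \partial g(x) \;=\; \sum_{(x,y) \in A} g(x,y) \;-\; \sum_{(y,x) \in A} g(y,x).
\]
% \kappa is symmetric, so \partial\kappa(x) = 0 at every vertex x.
% f_2 and f_{-2} have the same value v_2 and swapped endpoints,
% so \partial f_2(x) + \partial f_{-2}(x) = 0 at every vertex x.
\[
  \partial(\kappa - f_1 - f_2 - f_{-2})(x)
  \;=\; 0 - \partial f_1(x) - 0
  \;=\;
  \begin{cases}
    -v_1 & \text{if } x = s_1,\\
    +v_1 & \text{if } x = t_1,\\
    0    & \text{otherwise.}
  \end{cases}
\]
% Non-negativity on every arc follows from f_1 \le \kappa - f_2 - f_{-2}.
```

Conservation holds everywhere except at $t_1$ (net output $v_1$) and $s_1$ (net input $v_1$), which is the defining property of a flow of value $v_1$ from $t_1$ to $s_1$.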



4.4 Summary 

Theorem 1 (symmetric 2-commodity flow). The symmetric cut criterion is
a necessary and sufficient condition for the existence of a solution to the Integral
Symmetric 2-Commodity Flow Problem. A solution can be found by Algorithm
1 in $6C_{flow} + O(|A|)$ steps.

Proof. According to Corollary 1, the symmetric cut criterion is a necessary
condition for the existence of a solution to our problem. The input is:

— a symmetric digraph $G = (V, A)$;

— a symmetric capacity $\kappa : A \to \mathbb{N}$;

— the source and the target $(s_1, s_2), (t_1, t_2) \in V^2$;

— the value $(v_1, v_2) \in \mathbb{N}^2$.

If the symmetric cut criterion is true for $(\kappa, (s_1, t_1, v_1), (s_2, t_2, v_2))$ then according
to Lemma 2, Algorithm 3 takes $3C_{flow} + O(|A|)$ steps; according to Lemma 3,
Algorithm 4 takes $2C_{flow} + O(|A|)$ steps and according to Lemma 4, Algorithm
5 takes $C_{flow} + O(|A|)$ steps; so Algorithm 1 completes in $6C_{flow} + O(|A|)$ steps.

Moreover, according to Lemmas 2, 3 and 4, the 4-commodity flow $(f_1, f_{-1},
f_2, f_{-2})$ computed by Algorithm 1 is a symmetric 2-commodity flow from $(s_1, s_2)$
to $(t_1, t_2)$ of value $(v_1, v_2)$ such that $(f_1 + f_{-1} + f_2 + f_{-2}) \le \kappa$, so it is a solution
to the problem.

5 Conclusion

We have proven that the Integral Symmetric 2-Commodity Flow Problem can
be solved in polynomial time. Interesting directions for further research deal with
the complexity of the Integral Symmetric MCF (likely NP-complete), and with
Integral Symmetric $k$-Commodity Flow Problems, with $k > 2$.

Further, if the requests are not symmetric, the complexity of Integral $k$-Commodity
Flow Problems in symmetric digraphs (with a symmetric capacity)
is also open for $k > 1$.


Abstract. The paper studies the problem of computing a minimal energy
cost range assignment in an ad-hoc wireless network which allows a
station $s$ to perform a broadcast operation in at most $h$ hops. The general
version of the problem (i.e., when transmission costs are arbitrary) is
known to be log-APX hard even for $h = 2$. The current paper considers
the well-studied real case in which $n$ stations are located on the plane
and the cost to transmit from station $i$ to station $j$ is proportional to
the $\alpha$-th power of the distance between station $i$ and $j$, where $\alpha$ is any
positive constant. A polynomial-time algorithm is presented for finding
an optimal range assignment to perform a 2-hop broadcast from a given
source station. The algorithm relies on dynamic programming and operates
in (worst-case) time $O(n^{\ldots})$. Then, a polynomial-time approximation
scheme (PTAS) is provided for the above problem for any fixed $h \ge 1$.
For fixed $h \ge 1$ and $\epsilon > 0$, the PTAS has time complexity $O(n^{\mu})$ where
$\mu = O((\alpha 2^{\alpha} h^{\alpha} / \epsilon)^{\alpha^h})$.

1 Introduction



Multi-hop wireless networks [13] require neither fixed, wired infrastructure nor
predetermined interconnectivity. In particular, ad hoc networking [11] is the most
popular type of multi-hop wireless network because of its simplicity. An ad-hoc
wireless network consists of a homogeneous system of radio stations connected
by wireless links. In an ad hoc network, every station is assigned a transmission
range. The overall range assignment determines a transmission (directed) graph
since one station $s$ can transmit to another station $t$ if and only if $t$ is within the
transmission range of $s$. The transmission range of a station depends, in turn, on
the energy power supplied to the station. In particular, the power $P_s$ required
by a station $s$ to correctly transmit data to another station $t$ must satisfy the
inequality

$$\frac{P_s}{\mathrm{dist}(s, t)^{\alpha}} \ge \gamma \qquad (1)$$

where $\mathrm{dist}(s, t)$ is the Euclidean distance between $s$ and $t$, $\alpha \ge 1$ is the
distance-power gradient, and $\gamma \ge 1$ is the transmission quality parameter. In an ideal
environment (i.e., in empty space) it holds that $\alpha = 2$ but it may vary from 1
to more than 6 depending on the environment conditions at the location of the
network (see [14]).
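To make Inequality (1) concrete, the smallest power that satisfies it is $\gamma \cdot \mathrm{dist}(s, t)^{\alpha}$. The helper below is a hypothetical illustration (the name `min_power` and the default values $\alpha = 2$, $\gamma = 1$ are choices made for this sketch) showing how the required power grows with distance.

```python
import math


def min_power(s, t, alpha=2.0, gamma=1.0):
    """Smallest P_s with P_s / dist(s, t)**alpha >= gamma."""
    return gamma * math.dist(s, t) ** alpha


# Doubling the distance multiplies the required power by 2**alpha.
p_near = min_power((0.0, 0.0), (3.0, 4.0))  # distance 5
p_far = min_power((0.0, 0.0), (6.0, 8.0))   # distance 10
```

With $\alpha = 2$ the second value is four times the first, which is why relaying through intermediate stations can be cheaper than one long transmission.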

The fundamental problem underlying any phase of a dynamic resource allocation
algorithm in ad-hoc wireless networks is the following. Find a transmission
range assignment such that (1) the corresponding transmission graph satisfies a
given connectivity property $\Pi$, and (2) the overall energy power required to deploy
the assignment (according to Inequality (1)) is minimized (see for example
[9,12]). In [6], the reader may find an exhaustive survey on the previous results
related to the above problem.

In this paper we address the case in which $\Pi$ is defined as follows: Given a set
of stations and a specific source station $s$, the transmission graph has to contain
a directed spanning tree rooted at $s$ (a branching from $s$) of depth at most $h$. The
relevance of this case is due to the fact that any transmission graph satisfying
the above property allows the source station to perform a broadcast operation
in at most $h$ hops. Broadcast is a task initiated by the source station which
transmits a message to all stations in the wireless network. This task constitutes
a major part of real life multi-hop radio networks [1,2,9].



Previous results. The broadcast range assignment problem described above is
a special case of the following optimization problem, called $h$-Minimum Energy
Consumption Broadcast Subgraph (in short, $h$-MECBS). Given a
weighted directed graph $G = (V, E)$ with $|V| = n$ and an edge weight function
$w : E \to \mathbb{R}^+$, a range assignment for $G$ is a function $r : V \to \mathbb{R}^+$; the
transmission (directed) graph induced by $G$ and $r$ is defined as $G_r = (V, E')$
where

$$E' = \bigcup_{v \in V} \{(v, u) : (v, u) \in E \wedge w(v, u) \le r(v)\}.$$

The $h$-MECBS problem is then defined as follows. Given a source node $s \in V$,
find a range assignment $r$ for $G$ such that $G_r$ contains a directed spanning tree
of $G$ rooted at $s$ of depth at most $h$ and $\mathrm{cost}(r) = \sum_{v \in V} r(v)$ is minimized.
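The definitions above translate almost directly into code. The sketch below is an illustration under assumptions: the graph is a dictionary of arc weights, a range assignment is a dictionary from nodes to ranges, and the function names are hypothetical. It builds the induced transmission graph, evaluates the cost as the sum of the ranges, and uses breadth-first search to measure how many hops a broadcast from $s$ needs.

```python
from collections import deque


def transmission_graph(weights, r):
    """Arcs (v, u) of G whose weight does not exceed the range r(v)."""
    return {(v, u) for (v, u), w in weights.items() if w <= r[v]}


def cost(r):
    """Overall cost of a range assignment: the sum of all ranges."""
    return sum(r.values())


def broadcast_depth(nodes, arcs, s):
    """BFS depth of the farthest node from s, or None if a node is unreachable."""
    depth = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for (a, b) in arcs:
            if a == v and b not in depth:
                depth[b] = depth[v] + 1
                queue.append(b)
    return max(depth.values()) if len(depth) == len(nodes) else None


# Three collinear stations at positions 0, 1, 2 with weights dist**2.
nodes = ["s", "a", "b"]
pos = {"s": 0.0, "a": 1.0, "b": 2.0}
weights = {(v, u): (pos[v] - pos[u]) ** 2 for v in nodes for u in nodes if v != u}

one_hop = {"s": 4.0, "a": 0.0, "b": 0.0}  # s reaches everyone directly
two_hop = {"s": 1.0, "a": 1.0, "b": 0.0}  # s reaches a, a relays to b

depth_one = broadcast_depth(nodes, transmission_graph(weights, one_hop), "s")
depth_two = broadcast_depth(nodes, transmission_graph(weights, two_hop), "s")
```

A spanning tree of depth at most $h$ rooted at $s$ exists exactly when the breadth-first depth is at most $h$, so the two assignments are feasible for $h = 1$ and $h = 2$ respectively, and the two-hop one is cheaper.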

The $h$-MECBS problem is NP-hard and, if P $\ne$ NP, it is not approximable
within a sub-logarithmic factor, even when the problem is restricted to
undirected graphs [10] and $h = 2$.

The intractability of the general version of $h$-MECBS does not necessarily
imply the same hardness result for its restriction to wireless networks. In
particular, let us consider, for any $d \ge 1$ and $\alpha \ge 1$, the family of graphs $N_d^{\alpha}$, called
($d$-dimensional) wireless networks, defined as follows. A complete (undirected)
graph $G$ belongs to $N_d^{\alpha}$ if it can be embedded in a $d$-dimensional Euclidean space
such that the weight of an edge is equal to the $\alpha$-th power of the Euclidean
distance between the two endpoints of the edge itself. The restriction of $h$-MECBS
to graphs in $N_d^{\alpha}$ is then denoted by $h$-MECBS[$N_d^{\alpha}$]. It is clear that the
previously described broadcast range assignment problem in the ideal 2-dimensional
environment corresponds to $h$-MECBS[$N_2^2$].

Observe that if $\alpha = 1$ (that is, the edge weights coincide with the Euclidean
distances), then the optimal range assignment is simply obtained by assigning
to $s$ the distance from the node farthest from it and assigning 0 to all other
nodes. We then have that, for any $d \ge 1$ and $h \ge 1$, $h$-MECBS[$N_d^1$] is solvable
in polynomial time. Moreover, it has also been shown that, for any $\alpha \ge 1$,
$h$-MECBS[$N_1^{\alpha}$] is solvable in polynomial time [7].

It is also possible to prove that, for any $d \ge 2$, $\alpha > 1$ and $h = n - 1$,
$h$-MECBS[$N_d^{\alpha}$] is NP-hard (this version is referred to as the unbounded case).
The proof of this result is an adaptation of the one given in [8] to prove the
NP-hardness of computing a minimum range assignment that guarantees the
strong connectivity of the corresponding transmission graph. This adaptation is
described in [4]. In [3,5] it is shown that, as for the unbounded case, whenever $\alpha \ge
d$ the MST-based algorithm proposed in [9] achieves constant approximation.
Given a graph $G \in N_d^{\alpha}$ and a specified source node $s$, the MST-based algorithm
first computes a minimum spanning tree $T$ of $G$ (observe that this computation
does not depend on the value of $\alpha$). Subsequently, it makes $T$ directed by rooting
it at $s$. Finally, the algorithm assigns to each vertex $v$ the maximum among the
weights of all edges of $T$ outgoing from $v$.
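The three steps of the MST-based algorithm can be sketched as follows. This is a minimal illustration, assuming stations are given as coordinate tuples and using a simple quadratic version of Prim's algorithm; the function name `mst_range_assignment` is hypothetical and the code is not taken from [9].

```python
import math


def mst_range_assignment(points, s, alpha=2.0):
    """MST-based heuristic: ranges induced by an MST rooted at station s."""
    n = len(points)
    # Prim's algorithm on Euclidean distances (the MST does not depend on alpha).
    parent = {s: None}
    best = {v: (math.dist(points[s], points[v]), s) for v in range(n) if v != s}
    while best:
        v = min(best, key=lambda u: best[u][0])
        _, p = best.pop(v)
        parent[v] = p
        for u in best:
            d = math.dist(points[v], points[u])
            if d < best[u][0]:
                best[u] = (d, v)
    # Prim grows the tree from s, so `parent` already orients it away from s.
    ranges = [0.0] * n
    for v, p in parent.items():
        if p is not None:
            # Each vertex gets the largest weight among its outgoing tree edges.
            ranges[p] = max(ranges[p], math.dist(points[p], points[v]) ** alpha)
    return ranges


# Stations on a line at x = 0, 1, 3; the source is station 0.
ranges = mst_range_assignment([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)], s=0)
```

Note that this heuristic targets the unbounded case: the resulting tree may be deep, so it gives no guarantee on the number of hops.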

Our results. Bounding the number of hops in message broadcasting on a wireless
network is a crucial issue for the QoS of the network. We thus aim to provide
efficient solutions for the $h$-MECBS[$\mathcal{N}^\alpha_2$] problem when h is a
"small" constant (i.e., independent of the network size). In particular, we
provide the first polynomial-time algorithm that solves the
2-MECBS[$\mathcal{N}^\alpha_2$] problem for any $\alpha > 0$. The algorithm uses
crossed dynamic programming to obtain an optimal solution.
The dynamic programming is far from being simple and requires O(n^) time to 
fill up the relative matrices. 

Then, we derive a polynomial-time approximation scheme (PTAS) that works
for $h$-MECBS[$\mathcal{N}^\alpha_2$], for any fixed constant $h \ge 1$. For fixed
$h \ge 1$ and $\varepsilon > 0$, the PTAS has time complexity $O(n^{\mu})$ where
$\mu = O((\alpha 2^{\alpha} h^{\alpha}/\varepsilon)^{\alpha^h})$.

2 A Polynomial-Time Algorithm for the 2-MECBS on the Plane

In this section we describe a polynomial-time algorithm for the 2-MECBS
problem on the Euclidean plane. The stations are represented by points in the
Euclidean plane. Let cost(c, p) be the cost required for station c to cover
station p with minimum power. We only require cost to be a positive function
for which cost(c, r_1) < cost(c, r_2) if dist(c, r_1) < dist(c, r_2) holds. This
also includes the cost function mentioned in the introduction.

The input of the algorithm consists of n points in the Euclidean plane, a
specified source station s and a cost function
$\mathrm{cost} : \{1, \dots, n\}^2 \to \mathbb{R}$ with the above properties.

Our algorithm is based on a procedure that computes an optimal 2-broadcast
range assignment for a fixed range of station s. Since the range of s in an optimal
solution is defined by the farthest station f covered by the station s, we only
have to invoke this procedure n - 1 times and take the best solution in order to
solve the 2-MECBS problem.

In the rest of this section, we describe the procedure for a fixed source range.
Since the range of s is fixed, we can define $V$ as the set of stations covered by
s. Let $\overline{V}$ be the set of stations not covered by s. Let us rename the
stations other than s as {1, ..., n - 1} such that the first
$m = |\overline{V}|$ stations are those in $\overline{V}$, and they are ordered in
clockwise order around the source station, starting with an arbitrary station in
$\overline{V}$. (The source station is still denoted by s.)

Definition 1. Let the interval $[l, r]$ denote the set of stations i such that
$l \le i \le r$.



Definition 2. For an interval I, let A(I) be the minimum cost required to cover
all the stations in I by stations from V.

According to the above definition, the value we are looking for in our procedure
is $A([1,m]) + \mathrm{cost}(s,f)$, where f is the station defining the source
range. Consider an optimal covering of the interval $[l,r]$ expressed by the
ranges of all stations in V. Geometrically, this solution is represented as an
arrangement $\mathcal{A}$ of disks, in which every disk represents the range of the
station in its center. Denote by $\Delta$ the set of points contained in these
disks. The disk of a station c in the arrangement is denoted by $\Delta_c$.

An alternative representation of $\mathcal{A}$ is to assign to each station
$p \in \overline{V}$ (that is, each station not covered by s) the station in V
that reaches p. In general, there may be many ways to define such an assignment.
We do it by the $\lambda$ function defined below.

Definition 3. For $i \in \overline{V}$ (a station not covered by s), let
$\beta(i)$ be the last point in $\Delta$ on the ray $\rho$ emerging from s in the
direction of i. Define $\lambda(i)$ to be the station in V whose disk contains
$\beta(i)$. If there is more than one, choose an arbitrary one. Let $\sigma$ denote
the segment

Fig. 1. Point i is always between e and $\beta(i)$



Note that $\beta(i)$ is on the border of some disk in the arrangement, and is not
necessarily a station.

In order for $\lambda$ to be a valid assignment, it is necessary to show the
following.

Lemma 1. In the arrangement $\mathcal{A}$, station i is contained in the range of
station $\lambda(i)$.



Proof. Clearly, $i \in \sigma$. Let $j = \lambda(i)$, and write $\Delta_c$ for the
disk of a station c. If $s \in \Delta_j$, then the entire segment $\sigma$ is
contained in $\Delta_j$ and we are done. Otherwise, consider Fig. 1. Let e be the
point in which the ray $\rho$ enters $\Delta_j$. We just have to prove that
$e \in \Delta_s$. Let $q_1$ and $q_2$ be the intersection points of the tangents
from s to $\Delta_j$. The lemma follows since
$\mathrm{dist}(s,e) \le \mathrm{dist}(s,q_1) = \mathrm{dist}(s,q_2) < \mathrm{dist}(s,j)$
and $j \in \Delta_s$. This also proves that the entire segment $\sigma$ is contained
in $\Delta$. □

If $j = \lambda(i)$, we say that i is $\lambda$-covered by j. By using the
$\lambda$-assignment, we now establish the following lemma, which states that the
cost of $\mathcal{A}$ can be obtained by combining optimal solutions of smaller
intervals, thus allowing the use of dynamic programming.

Lemma 2. There exist stations $c \in V$, $p \in \overline{V}$ (a station not
covered by s) and subintervals $J_1, \dots, J_t$ of $[l, r]$ such that the cost of
$\mathcal{A}$ can be expressed as

$$\mathrm{cost}(c,p) + \sum_{k=1}^{t} A(J_k).$$

Towards proving the lemma, we identify some properties of the arrangement.
Let $D(c,p)$ be a disk of the arrangement with center c and radius dist(c, p),
$c \ne p$. Let $[l,r]$ be the smallest interval that contains all stations that
are $\lambda$-covered by c. Let $I^-$ be the set of stations in $[l,r]$ that are
not $\lambda$-covered by c. Let $J_1, \dots, J_t$ be the partition of $I^-$ into
intervals such that i and j are in the same interval if and only if there is no
station $q \in [i,j]$ which is $\lambda$-covered by c. For $1 \le k \le t$, define
$M_k$ to be the set of stations in V which $\lambda$-cover some station in $J_k$.
The partitioning $J_1, \dots, J_t$ has two key properties (whose proof is
deferred to the full paper).

Proposition 1.

(P1) For every $1 \le k \le t$, no station in $J_k$ is $\lambda$-covered by
station c.

(P2) $M_i \cap M_j = \emptyset$ for every $1 \le i < j \le t$.

Proof of Lemma 2. We clearly have to pay the cost of the range of c, denoted by
cost(c, p). For the second part, consider a set $J_k$. Let $\mathcal{A}_k$ be the
optimal solution for covering the stations in $J_k$. Assume the cost of
$\mathcal{A}_k$ was strictly smaller than the sum of the costs of the stations in
$M_k$. Because of Property (P2), we could remove the ranges of the stations in
$M_k$ from $\mathcal{A}$ and add the ranges of $\mathcal{A}_k$. This new solution
would be cheaper than the previous one, which contradicts the optimality of
$\mathcal{A}$. □



Let $S([l,r],c,p)$ be the set of stations in the interval $[l,r]$ which are not
covered by $D(c,p)$. In order to make use of Lemma 2, we have to solve the problem
of finding the $J_1, \dots, J_t$ for given $[l,r]$, c, and p. This is a kind of
one-dimensional set covering problem: we have to find a set $\mathcal{N}$ of
subintervals of $[l,r]$ such that any station in $S([l,r],c,p)$ is contained in at
least one interval of $\mathcal{N}$ and $\sum_{J \in \mathcal{N}} A(J)$ is
minimized.

Let $B([l,r],c,p)$ be the cost of the optimal covering of the stations in
$S([l,r],c,p)$. Note that $A(J) \le A(J')$ if $J \subseteq J'$. This implies that
the sets in $\mathcal{N}$ can be chosen such that they do not intersect. Then
$B([l,r],c,p) = B([l+1,r],c,p)$ if $l \in D(c,p)$, and
$B([l,r],c,p) = B([l,r-1],c,p)$ if $r \in D(c,p)$. The general case comes from the
fact that an optimal partitioning is composed of a first interval $[l,k]$ (which
will have $k \in S([l,r],c,p)$) and an optimal partitioning of the remaining
interval $[k+1,r]$, so that

$$B([l,r],c,p) = \min_{k \in S([l,r],c,p)} \{ A([l,k]) + B([k+1,r],c,p) \}.$$

Finally, $B([l,r],c,p) = 0$ for $l > r$ or $S([l,r],c,p) = \emptyset$.

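Having $B([l,r],c,p)$ at hand, $A([l,r])$ can be computed considering the optimal
partitioning for all pairs of c, p (with $\overline{V}$ denoting the stations not
covered by s):

$$A([l,r]) = \min_{c \in V,\ p \in \overline{V} \,:\, l \in D(c,p)} \{ B([l+1,r],c,p) + \mathrm{cost}(c,p) \}.$$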

The tables must be filled alternatingly starting with the smallest intervals. Since 
B{[1, r],c,p) might use A{[1, r]), the latter should be computed first. The optimal 
value can be found in A([l,r]). The running time of the algorithm is O(n^). 
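
The two recurrences for A and B can be sketched as a memoized mutual recursion. The sketch below makes several assumptions that are not taken from the paper: the function name `two_hop_cover_cost` is hypothetical, the cost is taken to be $\mathrm{dist}^\alpha$, the radius-defining station p ranges over the stations to be covered, and the caller is expected to pass those stations already sorted clockwise around s. It covers only the procedure for one fixed source range; the outer loop over the n - 1 candidate ranges of s and the term cost(s, f) are left out.

```python
import math
from functools import lru_cache

def two_hop_cover_cost(covering, uncovered, alpha=2.0):
    """Hypothetical sketch of A([1, m]) for one fixed source range.

    covering:  coordinates of the stations reached directly by s (the set V)
    uncovered: coordinates of the stations still to be covered, assumed to be
               listed in clockwise order around s
    """
    m = len(uncovered)

    def cost(c, p):
        return math.dist(covering[c], uncovered[p]) ** alpha

    def in_disk(i, c, p):
        # Is uncovered station i inside D(c, p)?
        return math.dist(covering[c], uncovered[i]) <= math.dist(covering[c], uncovered[p])

    @lru_cache(maxsize=None)
    def A(l, r):
        if l > r:
            return 0.0
        best = math.inf
        for c in range(len(covering)):
            for p in range(m):
                if in_disk(l, c, p):
                    best = min(best, cost(c, p) + B(l + 1, r, c, p))
        return best

    @lru_cache(maxsize=None)
    def B(l, r, c, p):
        if l > r:
            return 0.0
        if in_disk(l, c, p):
            return B(l + 1, r, c, p)
        best = math.inf
        for k in range(l, r + 1):
            if not in_disk(k, c, p):          # k must belong to S([l, r], c, p)
                best = min(best, A(l, k) + B(k + 1, r, c, p))
        return best

    return A(0, m - 1)
```

Every call to A on an interval only reaches strictly shorter intervals through B, which mirrors the remark that the tables are filled alternatingly starting with the smallest intervals.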

3 A PTAS for Any Constant h 

The set of n stations is specified by the set of points
$X = \{x_1, \dots, x_n\}$ in the Euclidean plane, where $x_1 = s$. Without loss of
generality assume that the points are ordered by their distance from s, and in
particular, $x_n$ is the point farthest from s, and let $R = \mathrm{dist}(s, x_n)$.
Every range assignment induces the set of disks $\{D(x_1), \dots, D(x_n)\}$, each
with center $x_i$ and radius $r(x_i)$. If $r(x_i) > 0$, we say that the disk
$D(x_i)$ belongs to the assignment. For every two points $x_i, x_j \in X$, a path
from $x_i$ to $x_j$ in the assignment is an ordered set of disks
$\{D_1, \dots, D_k\}$ belonging to the range assignment with centers
$y_1, \dots, y_k$ respectively, such that $y_1 = x_i$, $x_j \in D_k$ and
$y_i \in D_{i-1}$ for each $i = 2, \dots, k$. A minimum hop path from $x_i$ to
$x_j$ is a path containing a minimum number of disks among all paths from $x_i$ to
$x_j$. In an h-broadcast assignment, the radii must be assigned such that for every
$2 \le i \le n$, there exists a path containing at most h disks from s to $x_i$.
Let $S^*$ be an optimal solution to the problem, and denote its cost by $C^*$.
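
The definition of an h-broadcast assignment can be checked mechanically by a breadth-first search that stops expanding after h disks. The following is a minimal sketch under assumptions made here: the function name `is_h_broadcast` and the list-based representation of points and radii are hypothetical, and a station with radius 0 is treated as not transmitting.

```python
import math
from collections import deque

def is_h_broadcast(points, radii, s, h):
    """Hypothetical check: does every station lie on a path of at most h disks from s?"""
    n = len(points)
    hops = [None] * n          # hops[v] = number of disks on a shortest path to v
    hops[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if hops[u] == h or radii[u] <= 0:
            continue           # u may not extend the path any further
        for v in range(n):
            if hops[v] is None and math.dist(points[u], points[v]) <= radii[u]:
                hops[v] = hops[u] + 1
                queue.append(v)
    return all(x is not None for x in hops)
```

For three collinear stations at 0, 1 and 2 with radii 1, 1 and 0, the assignment is a 2-broadcast from the first station but not a 1-broadcast.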

For every $\varepsilon > 0$, let $k = \alpha 2^{\alpha} h^{\alpha}/\varepsilon$.
Define the sequence $\{k_i\}_{i=2}^{h}$ where $k_2 = k$ and
$k_{i+1} = k \cdot k_i^{\alpha}$ for each $i = 2, \dots, h - 1$.
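
To get a feeling for how quickly the thresholds $k_i$ grow, the recurrence can be evaluated directly. The helper name `k_sequence` is an assumption of this sketch, and it presumes the reading $k_{i+1} = k \cdot k_i^{\alpha}$ of the definition above.

```python
def k_sequence(alpha, h, eps):
    """Hypothetical helper: return {i: k_i} for i = 2, ..., h (only k_2 if h < 3)."""
    k = alpha * 2 ** alpha * h ** alpha / eps
    ks = {2: k}
    for i in range(2, h):
        ks[i + 1] = k * ks[i] ** alpha
    return ks
```

Already for $\alpha = 1$, $h = 3$ and $\varepsilon = 1$ this gives $k_2 = 6$ and $k_3 = 36$; larger $\alpha$ or smaller $\varepsilon$ make the values grow much faster.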

Notice that in any optimal solution, the disk around any point $x_j$ is of radius
$\mathrm{dist}(x_j, x_i)$ for some $1 \le i \le n$. For $2 \le i \le n$, let $D_i$
be the disk of radius $\mathrm{dist}(s, x_i)$ centered at s. A range assignment is
called a principal h-broadcast if it consists of one such disk $D_i$ around s, for
some $2 \le i \le n$, plus up to $k_h^{\alpha}$ disks around other points, each of
radius at least $R/k_h$.

Our algorithm operates as follows. For given fixed $\varepsilon$ and h, if
$\varepsilon \ge h^{\alpha-1}$ then the algorithm returns a single disk of radius R
centered at s. Otherwise, the algorithm examines all principal h-broadcasts, and
outputs the one attaining the minimal cost.



3.1 Analysis 

We first observe that the algorithm is polynomial for fixed $\varepsilon$ and h.
For fixed $h \ge 1$ and $\varepsilon > 0$, its time complexity is $O(n^{\mu})$
where $\mu = O((\alpha 2^{\alpha} h^{\alpha}/\varepsilon)^{\alpha^h})$.

Our approximation ratio analysis is based on the observation that the single 
disk solution, obtained by taking a single disk of radius R centered at s, yields 
a constant approximation to the optimal solution. (The proof is deferred to the 
full version of the paper.) 

Lemma 3. The single disk solution provides an approximation of ratio
$h^{\alpha-1}$, namely, $R^{\alpha} \le h^{\alpha-1} C^*$.

We now prove that the cost of the solution produced by our algorithm is at
most $(1+\varepsilon)C^*$. Notice that if $\varepsilon \ge h^{\alpha-1}$, then the
single disk solution generated by the algorithm attains the desired bound
trivially from Lemma 3. Hence hereafter we assume that
$\varepsilon < h^{\alpha-1}$.

Consider the disks that belong to the optimal solution $S^*$. For each such disk
$D^*(x)$, define its level to be the number of disks in the minimum hop path from
s to x. Define a disk $D^*(x)$ of level j, $2 \le j \le h$, to be large if
$r^*(x) \ge R/k_j$; otherwise it is small. For uniformity, define also the disk
$D^*(s)$ to be large. For each level $j \ge 1$, let $m_j$ be the number of large
disks of level j in the optimal solution $S^*$. (Always $m_1 = 1$.) Thus the large
disks of level j contribute at least $m_j R^{\alpha}/k_j^{\alpha}$ to $C^*$, the
cost of the optimal solution. As $C^*$ cannot exceed the cost of the single disk
solution, namely $R^{\alpha}$, we have the following.

Proposition 2. For each level $j \ge 2$, $m_j \le k_j^{\alpha}$.

Also, noting that every large disk of level $j \ge 2$ has radius at least
$R/k_h$, we have the following.

Proposition 3. $S^*$ contains at most $k_h^{\alpha}$ large disks.

Now, consider the range assignment S derived from $S^*$ in the following way.
For each large disk $D^*(x)$ in $S^*$, let $f(x)$ be the farthest point from x of
higher level than x for which there is a minimum hop path from x to $f(x)$ that
contains only small disks (other than $D^*(x)$). For each large disk $D^*(x)$ in
$S^*$, let $\hat{r}(x) = \mathrm{dist}(x, f(x))$ and let $\varphi(D^*(x))$ be the
disk of radius $\hat{r}(x)$ around x. We now take S to contain the disk
$\varphi(D^*(x))$ for every large disk $D^*(x)$ in $S^*$.

Since the minimum hop path from x to f{x) contains at most h — 1 disks of 
increasing levels, which are all small to their level, dist(a:, /(x)) < 

and as R/kj > h- R/kj+i by the choice of k and the assumption that e < 
we have the following. 

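Proposition 4. For every large disk $D^*(x)$ in $S^*$ of level j, with
$\hat{r}(x) = \mathrm{dist}(x, f(x))$ the radius of $\varphi(D^*(x))$:

1. If $2 \le j \le h - 1$ then $\hat{r}(x) \le r^*(x) + 2R/k_{j+1}$.

2. If $j = h$ then $\hat{r}(x) = r^*(x)$.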



Lemma 4. The assignment S is a principal h-broadcast.

Proof. By Proposition 3, S contains at most $k_h^{\alpha}$ disks, all of which
(except maybe the one centered at s) have radius at least $R/k_h$. It remains to
argue that every point $x \in X$ is reachable by a path of h or fewer hops.
Consider such a point x and suppose a minimum hop path from s to x in $S^*$ is
established by the disks $D_1^*, \dots, D_l^*$, where $l \le h$. Note that the
minimality of the path ensures that the level of each of the disks $D_j^*$ is
exactly j. Each small disk $D_j^*$ in this list is now contained in the large disk
$\varphi(D_i^*) \in S$ such that $D_i^*$ is the closest large disk in $S^*$
preceding $D_j^*$ in the list. Therefore the number of hops in the path to x in S
is no greater than in $S^*$. □

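We thus conclude that the assignment S was checked by our algorithm. We
now bound the cost of this assignment, denoted C. For each $j \ge 1$ and
$i = 1, \dots, m_j$, let $r_{i,j}$ be the radii of the large disks of level j.
Then, by Proposition 4, C satisfies

$$C \le \sum_{j=1}^{h-1} \sum_{i=1}^{m_j} \Big( r_{i,j} + \frac{2R}{k_{j+1}} \Big)^{\alpha} + \sum_{i=1}^{m_h} r_{i,h}^{\alpha}. \qquad (2)$$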

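We rely on the following technical fact (which can be verified, say, by looking at
the Taylor expansion of the function $(1+z)^{\alpha}$).

Fact 1. For $\alpha \ge 1$ and $0 \le z \le 1$,
$(1+z)^{\alpha} \le 1 + \alpha z (1+z)^{\alpha-1}$.

Proposition 5.
$$\Big( r_{i,j} + \frac{2R}{k_{j+1}} \Big)^{\alpha} \le r_{i,j}^{\alpha} + \frac{\alpha 2^{\alpha} R^{\alpha}}{k_{j+1}}.$$

Proof. Note that $r_{i,j} \ge 2R/k_{j+1}$ for every i and j, so
$z = 2R/(k_{j+1} r_{i,j})$ satisfies $0 \le z \le 1$. This allows us to use Fact 1
and get

$$\Big( r_{i,j} + \frac{2R}{k_{j+1}} \Big)^{\alpha}
\le r_{i,j}^{\alpha} \Big( 1 + \frac{2R\alpha}{k_{j+1} r_{i,j}} \Big( 1 + \frac{2R}{k_{j+1} r_{i,j}} \Big)^{\alpha-1} \Big)
= r_{i,j}^{\alpha} + \frac{2R\alpha}{k_{j+1}} \Big( r_{i,j} + \frac{2R}{k_{j+1}} \Big)^{\alpha-1}
\le r_{i,j}^{\alpha} + \frac{2R\alpha}{k_{j+1}} \, (2 r_{i,j})^{\alpha-1}.$$

The thesis follows by observing that $R \ge r_{i,j}$.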
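Combining Inequality (2) with Proposition 5 yields that

$$C \le \sum_{i,j} r_{i,j}^{\alpha} + \sum_{j=1}^{h-1} \sum_{i=1}^{m_j} \frac{\alpha 2^{\alpha} R^{\alpha}}{k_{j+1}}.$$

As $C^* \ge \sum_{i,j} r_{i,j}^{\alpha}$ and using Proposition 2 and the definition
of $k_j$ (which give $m_j / k_{j+1} \le 1/k$),

$$C \le C^* + \sum_{j=1}^{h-1} m_j \, \frac{\alpha 2^{\alpha} R^{\alpha}}{k_{j+1}}
\le C^* + h \, \frac{\alpha 2^{\alpha} R^{\alpha}}{k}
\le C^* + \varepsilon \cdot C^*,$$

where the last inequality is established by the choice of k and Lemma 3. We
have thus established the following.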



Lemma 5. The algorithm yields a solution of cost at most (1 + e)C* . 



4 Conclusions and Open Problems 

In this paper we investigated the problem of computing a minimal cost range
assignment in ad-hoc wireless networks that guarantees the broadcast operation
from a given source station in at most h hops. We provide a polynomial-time
algorithm for the case h = 2 and a PTAS for any constant $h \ge 1$. Nothing is
known about the hardness of the case h > 2. We conjecture that there exists
some constant h for which the problem is NP-hard. This is the main problem
left open by this paper. Finally, nothing is known when h is any function of n
(except for h = n - 1).

Abstract. We introduce top-down deterministic transducers with rational
lookahead (transducer for short) working on infinite terms. We show that for
such a transducer T, there exists an MSO-transduction $\mathcal{T}$ such that for
any graph G, unfold($\mathcal{T}$(G)) = T(unfold(G)). Reciprocally, we show that
if an MSO-transduction $\mathcal{T}$ "preserves bisimilarity", then there is a
transducer T such that for any graph G, unfold($\mathcal{T}$(G)) = T(unfold(G)).
According to this, transducers can be seen as a complete method of
implementation of MSO-transductions that preserve bisimilarity. One application
is for transformations of equational systems.



1 Introduction 

The theory of tree transducers has been widely studied since the 1970s (see e.g.
[9]). Tree transducers are abstract machines describing relations between finite
terms. Among the numerous known families of transducers, one happens to be
a good compromise between decidability and expressiveness requirements: the
top-down tree transducers with regular lookahead [15]. Those transducers are
closed under composition, and preserve the regularity of sets of terms by inverse
image.

One application of tree transducers is to implement relations between domains
different from trees, in particular graphs. The principle is to attach to each
symbol a semantics mapping tuples of graphs to graphs of the correct arity, and
to use this semantics to evaluate any tree built upon those symbols. The
resulting object is a graph called the interpretation of the tree. In this
context, tree transducers describe relations between graphs through the trees
representing them. Engelfriet studied this approach [16] and, as it turns out,
top-down tree transducers with regular lookahead are particularly well suited to
this setting.

Top-down tree transducers have also been extended to macro tree transducers
[18], which are themselves equivalent to so-called tree-to-graph transducers
[19]. Those devices are strictly more expressive than top-down tree transducers.
Drewes compares them with respect to translations between algebras [13]. The
representation by monadically definable transformations of those transducers
has been studied extensively (see e.g. [17,4]); however, those results cannot be
seen as the finite case counterpart of the results presented in this paper.

We describe in the present paper a similar theory for top-down tree transducers,
but working on infinite terms. Such infinite terms can be interpreted into
infinite objects, as investigated first by Courcelle for graphs [10,1,2,7]
(technically, the interpretation is extended to infinite terms by a limit-passing
argument). The same need for transducers appears in this context. However, a
slightly different point of view can be adopted: each infinite tree can be
obtained as the unfolding of a (possibly infinite) graph. In this respect,
interpreting the term is equivalent to solving the graph seen as an equational
system. For this reason, we investigate how transducers of terms can be compared
with transformations of graphs: this approach produces tools for transforming
(possibly infinite) equational systems. This approach is extensively used in [8].

In this paper we introduce top-down transducers with rational lookahead
(from now on simply called transducers) working on infinite trees. We define
the notion of rationality by means of monadic second-order definability: a set
of (possibly infinite) trees is rational if it is the set of tree models of some
monadic second-order formula. A transducer is a deterministic device with a
finite number of states that reads a (possibly infinite) input tree starting from
the root and produces a (possibly infinite) output tree. Each transition consists
in either consuming the input root symbol, producing an output symbol, or
verifying that the input tree belongs to some rational set (this ability is called
the 'lookahead').

Major results concerning the finite tree case still hold for those transducers:
we show the closure under composition of transducers and the rationality of the
inverse image of a rational set by a transducer. However, an extra hypothesis of
determinism of the transducer is necessary in our proof. We also investigate the
relationships of those transducers with respect to unfolding and monadic
second-order transductions (MSO-transductions for short). We establish that the
result of a transducer applied to the unfolding of a graph can also be obtained by
the successive application of an MSO-transduction followed by an unfolding. We say
in this case that the MSO-transduction implements the transducer. Let us note
that such an MSO-transduction is by definition bisimilarity preserving. In fact,
a converse to this result also holds and is the most involved proof presented in
this work: every MSO-transduction implements a transducer provided that it
preserves bisimilarity. For this reason transducers can be understood as the tree
theoretic counterpart to MSO-transductions.

Among consequences of those results are that regularity of terms (but not 
rationality of sets of terms) is preserved by transducers. More generally, term 
solutions of safe higher-order program schemes of level n are closed under appli- 
cation of transducers [20,6,5]. 

The remainder of the paper is organized as follows. In the next section we give
the basic definitions on graphs, MSO-transductions, and transducers. In
Section 3 we state some basic properties of deterministic transducers and show
that the functions computed by them can also be obtained using MSO-transductions
followed by unfolding. In the last section we present the result that
MSO-transductions that preserve bisimilarity of graphs can be simulated by
deterministic transducers on the unfoldings of the graphs.

2 Definitions 

An (edge-labeled) graph G over an alphabet $\Sigma$ is a pair $G = (V_G, E_G)$
where $V_G$ is the set of vertices and
$E_G \subseteq V_G \times \Sigma \times V_G$ is the set of edges. A rooted graph G
is of the form $G = (V_G, E_G, r_G)$ where $V_G$ and $E_G$ are as before and
$r_G \in V_G$ is the root of G. A directed path in G is a sequence of vertices such
that successive vertices u, v in this sequence are connected by an edge
$(u, a, v) \in E_G$ for some $a \in \Sigma$. A sequence of vertices is an
undirected path if successive vertices u, v in this sequence are connected by an
edge $(u, a, v) \in E_G$ or $(v, a, u) \in E_G$. For two vertices $u, v \in V_G$,
a connection of u and v is a path from u to v that does not contain any vertex
twice.

An (undirected) tree t is a graph such that for each two vertices $u, v \in V_t$
there is exactly one undirected connection between u and v. A rooted tree t is a
tree such that for each $v \in V_t$ there is a directed path from the root $r_t$
of t to v. For a rooted graph G we denote by unfold(G) the rooted tree that is
obtained by unfolding G from the root $r_G$. If we want to unfold G from a vertex
v different from the root, then we write this as unfold(G, v).
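
Since unfoldings are in general infinite, a program can only build a finite truncation. The sketch below is an illustration under assumptions made here: the function name `unfold`, the encoding of a graph as a set of triples (u, a, v), and the encoding of a tree as nested tuples of (label, subtree) pairs are all hypothetical choices, and the tree is represented up to isomorphism rather than with explicit path vertices.

```python
def unfold(edges, root, depth):
    """Hypothetical sketch: unfolding of a rooted graph, cut off after `depth` edges.

    edges: iterable of triples (u, a, v); the result is a tuple of
    (label, subtree) pairs, one for each edge leaving the current vertex.
    """
    if depth == 0:
        return ()
    children = [(a, unfold(edges, v, depth - 1)) for (u, a, v) in edges if u == root]
    # Sort by textual representation so that the result does not depend on
    # the iteration order of the edge collection.
    return tuple(sorted(children, key=repr))
```

A single vertex with an a-labeled loop and a cycle of two vertices with a-labeled edges produce the same truncated unfolding at every depth, which illustrates that different graphs can share an unfolding.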

For a ranked alphabet $\mathcal{F}$ and $f \in \mathcal{F}$ we write $|f|$ for the
arity of f. By $|\mathcal{F}|_{\max}$ we denote the maximal rank of a symbol in
$\mathcal{F}$. We represent terms over $\mathcal{F}$ as rooted trees over the
alphabet $\Sigma_{\mathcal{F}} = \mathcal{F} \cup \{1, \dots, |\mathcal{F}|_{\max}\}$.
A term over $\mathcal{F}$ is a rooted tree t over $\Sigma_{\mathcal{F}}$ such that

- there is exactly one edge starting from $r_t$ and this edge is labeled with a
letter from $\mathcal{F}$,

- if there is an edge $(v, f, v') \in E_t$ for some $f \in \mathcal{F}$, then this
is the only edge starting from v and there are exactly $|f|$ edges starting from
$v'$ labeled by $1, \dots, |f|$, respectively, and

- if there is an edge $(v, \ell, v') \in E_t$ with
$\ell \in \{1, \dots, |\mathcal{F}|_{\max}\}$, then there is an edge labeled by a
letter from $\mathcal{F}$ starting from $v'$.

The set of all $\mathcal{F}$-terms is denoted by $T(\mathcal{F})$.

We say that a rooted graph G represents a term iff unfold(G) is a term. We
are only interested in graphs representing terms. Therefore, in the following an
$\mathcal{F}$-graph always means a rooted graph over $\Sigma_{\mathcal{F}}$ that
represents a term. So, the $\mathcal{F}$-trees are exactly the
$\mathcal{F}$-terms. For two $\mathcal{F}$-graphs G and G' we write $G \sim G'$
if G and G' represent the same term, i.e., if unfold(G) = unfold(G'). Since
$\mathcal{F}$-graphs are deterministic, they have the same unfolding iff they are
bisimilar. Hence, on $\mathcal{F}$-graphs the relation $\sim$ corresponds to
bisimulation equivalence (cf. [22]).

According to the above definition of terms there is a natural partition of the
vertices of an $\mathcal{F}$-graph into those vertices being the source of an
$\mathcal{F}$-edge and those vertices being the source of edges labeled with
natural numbers. We denote the former of these two sets by
$V_G^{\mathcal{F}} = \{ v \in V_G \mid \exists f \in \mathcal{F}\ \exists u \in V_G : (v, f, u) \in E_G \}$.
Since the vertices that are not from $V_G^{\mathcal{F}}$ are those that have to be
inserted when passing from the usual representation of terms to our
representation, we call them auxiliary vertices. The vertices from
$V_G^{\mathcal{F}}$ are called main vertices.

MSO-Transductions. For the remainder of this article we fix two ranked
alphabets $\mathcal{F}$, $\mathcal{F}'$ and write $\Sigma$, $\Sigma'$ instead of
$\Sigma_{\mathcal{F}}$ and $\Sigma_{\mathcal{F}'}$. We assume the standard
syntax and semantics of MSO logic over graphs, i.e., quantification over
individual vertices (first-order quantification) and quantification over sets of
vertices (monadic second-order quantification). For an introduction to MSO logic
we refer the reader to [14]. An MSO-transduction $\mathcal{T}$ is of the form

^ T E ^ (^, y))ie[l,n]7 ^) 

with MSO-formulas $\varphi_{a,i,j}(x,y)$ and $\rho_i(x,y)$ over the signature
$(E_a)_{a \in \Sigma}$, where each $E_a$ is a binary symbol interpreted as the set
of a-labeled edges.

In order to obtain a unique root we require that for all $\mathcal{F}$-graphs G
and all $v \in V_G$ there is at most one $u \in V_G$ and $i \in [1,n]$ such that
$G \models \rho_i(v,u)$. For each $\mathcal{F}$-graph G we define the graph
$\mathcal{T}(G)$ over $\Sigma'$ that is obtained by applying $\mathcal{T}$ to G as
follows. If there are no $u \in V_G$ and $i \in [1,n]$ with
$G \models \rho_i(r_G,u)$, then $\mathcal{T}(G)$ is undefined. Otherwise,

- $V_{\mathcal{T}(G)} = V_G \times [1,n]$,

- for $a \in \Sigma'$ and $i,j \in [1,n]$ there is an edge $((v,i), a, (u,j))$ in
$E_{\mathcal{T}(G)}$ iff $G \models \varphi_{a,i,j}(v,u)$, and

- $r_{\mathcal{T}(G)} = (u,i)$ for the unique u and i with
$G \models \rho_i(r_G,u)$.
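
On a finite graph the construction of the transformed graph can be carried out literally. In the sketch below the MSO formulas are replaced by arbitrary Python predicates, which is an assumption made purely for illustration (nothing here evaluates MSO logic); the function name `apply_transduction` and the argument layout are likewise hypothetical.

```python
def apply_transduction(vertices, edges, root, n, out_labels, phi, rho):
    """Hypothetical sketch of applying a transduction with n copies to a finite graph.

    phi(edges, a, i, j, v, u) and rho(edges, i, v, u) are Python predicates
    standing in for the formulas of the transduction.
    Returns (vertices, edges, root) of the new graph, or None if it is undefined.
    """
    copies = range(1, n + 1)
    roots = [(u, i) for u in vertices for i in copies if rho(edges, i, root, u)]
    if not roots:
        return None                      # the transformed graph is undefined
    if len(roots) > 1:
        raise ValueError("the root formulas must select at most one root")
    new_vertices = {(v, i) for v in vertices for i in copies}
    new_edges = {
        ((v, i), a, (u, j))
        for a in out_labels
        for i in copies
        for j in copies
        for v in vertices
        for u in vertices
        if phi(edges, a, i, j, v, u)
    }
    return new_vertices, new_edges, roots[0]
```

With a single copy, a root predicate that keeps the root in place, and an edge predicate that turns every a-labeled edge into a b-labeled one, the result is the input graph relabeled.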

Note that our definition of MSO-transduction slightly differs from the standard
definition (cf. [11]). We are interested in rooted graphs and thus we need the
formulas $\rho_i(x, y)$ to define the roots of the transformed graph. Furthermore,
since our main interest is in the unfolding of the transformed graphs, we do not
need a formula restricting the domain of $\mathcal{T}$. We sometimes write
$\mathcal{T}(G, v)$ for some vertex v of G to denote the application of
$\mathcal{T}$ to the graph G with its root changed to v.

The definition of an MSO-transduction does not enforce that $\mathcal{T}(G)$
represents an $\mathcal{F}'$-term when applied to a graph G representing an
$\mathcal{F}$-term. Furthermore, we are interested in simulating
MSO-transductions by transducers working on terms. Thus, we want to consider
MSO-transductions that, when applied to two $\mathcal{F}$-graphs representing the
same term, yield two $\mathcal{F}'$-graphs again representing the same term. This
is captured by the following definition.

We call an MSO-transduction $\mathcal{T}$ bisimilarity preserving iff

- $\mathcal{T}(G)$ (if it is defined) is an $\mathcal{F}'$-graph for each
$\mathcal{F}$-graph G and

- for all $\mathcal{F}$-graphs G and G', if $G \sim G'$, then $\mathcal{T}(G)$ is
defined iff $\mathcal{T}(G')$ is defined and $\mathcal{T}(G) \sim \mathcal{T}(G')$.

So, bisimilarity preserving MSO-transductions transform $\mathcal{F}$-graphs into
$\mathcal{F}'$-graphs and preserve bisimulation equivalence of graphs. In
particular, because of the first condition, all the formulas
$\varphi_{a,i,j}(x,y)$ in a bisimilarity preserving MSO-transduction $\mathcal{T}$
must be deterministic in the following sense. For all $\mathcal{F}$-graphs G
and $v \in V_G$ there is at most one $u \in V_G$ such that
$G \models \varphi_{a,i,j}(v,u)$. We call an MSO-transduction with this property
deterministic.

Transducers with Rational Lookahead. A top-down tree transducer with rational
lookahead (transducer for short) is a tuple
$T = (Q, \mathcal{F}, \mathcal{F}', q_0, \Delta)$ with:

- Q a finite set of states,

- $q_0 \in Q$ the initial state, and

- $\Delta$ a finite set of rules of one of the following forms:

(production rule): $q(x) \to g(q_1(x), \dots, q_{|g|}(x))$ with
$g \in \mathcal{F}'$, x a variable, and $q_1, \dots, q_{|g|} \in Q$.

(consumption rule): $q(f(x_1, \dots, x_{|f|})) \to q'(x_i)$ with
$f \in \mathcal{F}$, $q, q' \in Q$, and $x_1, \dots, x_{|f|}$ variables.

(lookahead rule): $q(x \in L) \to q'(x)$ with L a rational set of
$\mathcal{F}$-terms (called lookahead set), $q, q' \in Q$, and x a variable.

Each rule of $\Delta$ can be interpreted as a rewrite rule. A lookahead rule
$q(x \in L) \to q'(x)$ can only be applied to q(t) if t is a term from L. Hence
the lookahead rules allow one to 'inspect' the input tree and collect some
information about it.

A transducer $T = (Q, \mathcal{F}, \mathcal{F}', q_0, \Delta)$ is deterministic if
for each state $q \in Q$ and each $\mathcal{F}$-term t the set of rules that can
be applied to q(t)

- either consists of lookahead rules with pairwise disjoint lookahead sets, or

- contains exactly one production rule, or

- contains exactly one consumption rule.

According to the above definition we will speak of production states,
consumption states, and lookahead states.

The result T(t) of applying T to an $\mathcal{F}$-term t is the term that is
obtained from t by applying the rewrite rules of T 'to the limit', starting from
$q_0(t)$. In the formal definition of T(t) we have to be careful because we cannot
simply define the image of an infinite term as the limit of a sequence of images
of finite terms. Because of the lookahead, the functions computed by deterministic
transducers need not be continuous.

Let $T = (Q, \mathcal{F}, \mathcal{F}', q_0, \Delta)$ be a deterministic transducer
and let $\mathcal{F}'_{\bot}$ be the ranked alphabet $\mathcal{F}'$ augmented by a
new symbol $\bot$ of rank 0. By induction on n we define for each state
$q \in Q$ and each infinite term $t \in T(\mathcal{F})$ the term
$\delta_n(q,t) \in T(\mathcal{F}'_{\bot})$ as $\delta_0(q,t) = \bot$, and

- if $q(x) \to g(q_1(x), \dots, q_{|g|}(x)) \in \Delta$, then
$\delta_{n+1}(q,t) = g(\delta_n(q_1,t), \dots, \delta_n(q_{|g|},t))$,

- if $q(f(x_1, \dots, x_{|f|})) \to q'(x_i) \in \Delta$, then
$\delta_{n+1}(q,t) = \delta_n(q',t_i)$ for $t = f(t_1, \dots, t_{|f|})$,

- if $q(x \in L) \to q'(x) \in \Delta$ and $t \in L$, then
$\delta_{n+1}(q,t) = \delta_n(q',t)$.

If no transition of the transducer can be applied or if the right hand side in the
definition of $\delta_{n+1}(q,t)$ is undefined, then $\delta_{n+1}(q,t)$ is
undefined.

First note that in each situation at most one rule can be applied because
of the determinism of the transducer. Therefore, if $\delta_n(q,t)$ is defined,
then it is unique. If we consider the complete partial order $\sqsubseteq$ on
$\mathcal{F}'_{\bot}$-terms with $t' \sqsubseteq t$ if $t'$ is obtained from t by
replacing subterms with $\bot$, then one can easily show by induction on n that
the sequence $(\delta_n(q,t))_{n \in \mathbb{N}}$ is either increasing (w.r.t.
$\sqsubseteq$) or undefined from a certain point onward. In the former case we let
$T_q(t)$ be the limit of this sequence and in the latter case $T_q(t)$ is
undefined. Now we can define $T(t) = T_{q_0}(t)$.
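
The inductive definition of the approximations can be executed directly on finite terms. The sketch below rests on assumptions made only for illustration: terms are nested tuples `(symbol, child_1, ..., child_k)`, lookahead sets are given as Python predicates instead of rational sets, the rule table maps each state to exactly one kind of rule (which builds in determinism), and the names `delta`, `BOTTOM` and the example states are hypothetical.

```python
BOTTOM = ("⊥",)   # stands for the extra symbol of rank 0

def delta(n, q, t, rules):
    """Hypothetical sketch: the n-th approximation for state q on term t, or None if undefined."""
    if n == 0:
        return BOTTOM
    kind, data = rules[q]
    if kind == "produce":                 # q(x) -> g(q1(x), ..., qk(x))
        g, states = data
        children = [delta(n - 1, qi, t, rules) for qi in states]
        if any(c is None for c in children):
            return None
        return (g, *children)
    if kind == "consume":                 # q(f(x1, ..., xk)) -> q'(xi)
        if t[0] not in data:
            return None                   # no rule applies to this root symbol
        i, q_next = data[t[0]]
        return delta(n - 1, q_next, t[i], rules)
    if kind == "lookahead":               # q(x in L) -> q'(x)
        for member, q_next in data:
            if member(t):
                return delta(n - 1, q_next, t, rules)
        return None
    raise ValueError("unknown rule kind")

# Example: erase the unary symbol 'a' in front of the constant 'c'.
rules = {
    "q":  ("lookahead", [(lambda t: t[0] == "a", "qa"), (lambda t: t[0] == "c", "qc")]),
    "qa": ("consume", {"a": (1, "q")}),
    "qc": ("produce", ("c", [])),
}
t = ("a", ("a", ("c",)))
```

On this example the approximations stay at the bottom symbol until enough steps are available to pass both occurrences of `a`, after which the output stabilizes at the constant `c`.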

3 First Results about Transducers 

In this section, we establish that the inverse image of a rational set by a trans- 
ducer is also rational (Lemma 2). We also show that transducers can be imple- 
mented by MSO-transductions (Theorem 1). 

According to the current definition, it may happen that for some
$\mathcal{F}$-term t and some state q the term $T_q(t)$ still contains the symbol
$\bot$. This phenomenon can be a technical burden for the following proofs. We
start this section by normalizing transducers in such a way that this situation
does not occur anymore.

Formally, let T be a transducer from $\mathcal{F}$-terms to $\mathcal{F}'$-terms
with set of states Q; we say that T is normalized if for any state $q \in Q$ and
any $\mathcal{F}$-term t, $T_q(t) \ne \bot$.

For T to be normalized it is sufficient but not necessary to have a produc- 
tion in each of its cycles. Consider for instance a transducer that would remove 
all the occurrences of a given symbol — say a of arity 1 — provided that this 
symbol has only a finite number of occurrences in the term. This transducer 
contains cycles without production since an unbounded number of a can be re- 
moved without producing any output symbol. However, by definition this cannot 
happen infinitely often. 

Lemma 1. Let T be a transducer from $\mathcal{F}$ to $\mathcal{F}'$ and $\bot'$ be
a new symbol of arity 0. There exists effectively a transducer $T'$ from
$\mathcal{F}$-terms to $(\mathcal{F}' \cup \{\bot'\})$-terms such that
$T' = h \circ T$, where h replaces every occurrence of the symbol $\bot$ in a term
by $\bot'$.

An important property of deterministic transducers is that their domain is
rational. More precisely, as stated in Lemma 2, the inverse image of a rational
language by a normalized deterministic transducer is rational.

Lemma 2. Let T be a deterministic transducer from $\mathcal{F}$-terms to
$\mathcal{F}'$-terms. If L is a rational subset of $T(\mathcal{F}')$, then the set
$T^{-1}(L)$ is also rational.

Let us notice that in this proof, the determinism of the transducer is explicitly 
needed and we don’t know if the result remains true without this restriction. 
This was not the case for transducers of finite trees. It also follows directly from 
this lemma that the domain of a transducer is rational. 

Finally, we aim at establishing Theorem 1 which expresses how a transducer 
can be simulated by an MSO-transduction before unfolding. First of all we need 
a result relating lookaheads with MSO-logic. It is a particular case of Courcelle’s
result in [12]. 

Lemma 3. For any rational set of $\mathcal{F}$-terms $L$, there exists an MSO-formula
$\phi(x)$ such that for any $\mathcal{F}$-graph $G$ and any vertex $v \in V_G$, $\mathrm{unfold}(G, v) \in L$
iff $G \models \phi(v)$.

We can now state the main result of this section. 

Theorem 1. Let $T$ be a normalized deterministic transducer from $\mathcal{F}$-terms to
$\mathcal{F}'$-terms. There exists effectively an MSO-transduction $\mathcal{T}$ such that for any
$\mathcal{F}$-graph $G$ and any vertex $v \in V_G$, $\mathcal{T}(G, v)$ is defined iff $T(\mathrm{unfold}(G, v))$ is defined,
and in this case $\mathrm{unfold}(\mathcal{T}(G, v)) = T(\mathrm{unfold}(G, v))$.

The proof uses one copy of the graph for each state of the transducer.
The MSO-formulas then place the edges correctly. Lemmas 2 and 3 combined
allow the implementation of lookahead rules.

4 From MSO-Transductions to Transducers

The goal of this section is to show that bisimilarity preserving MSO-transduc- 
tions can be simulated by deterministic transducers on the unfoldings of graphs. 
In a first step we use the fact that T is bisimilarity preserving to obtain some 
kind of normal form for T. 

Since rooted graphs are bisimilar to their unfoldings, the following simple
remark allows us to consider MSO-transductions operating on trees instead
of graphs. Formally, this means that if $\mathcal{T}$ is a bisimilarity preserving MSO-transduction,
then $\mathrm{unfold}(\mathcal{T}(G)) = \mathrm{unfold}(\mathcal{T}(\mathrm{unfold}(G)))$ for every $\mathcal{F}$-graph $G$.

To simulate an MSO-transduction by a deterministic transducer we would 
like the MSO-transduction to ‘respect’ the type (main or auxiliary) of the vertices 
since transducers only work on main vertices. Furthermore, transducers work in 
a top-down fashion. Thus, we want to normalize the MSO-transduction in such 
a way that the new edges go ‘downward’ if the transduction is applied to a 
tree. Under this assumption a deterministic transducer can construct the edges 
defined by the MSO-transduction by going down the term that is represented 
by t and using its rational lookahead. The following definition formally captures 
the properties we need to simulate an MSO-transduction by a deterministic 
transducer. 

Definition 1. An MSO-transduction T={E, E' , {4>a,i,j{x,y))a,ij, (pi{x,y))i,n) 
is in top-down normal form iff for all T -trees t 

(a) rT{t) = (rtO), 

(h) ift 1= 4>g,i^,^2(vi,V2) and t ^ (t>tM,iffv 2 ,vfij for g G T' , £ G {I, . . . , |.?^'|max}, 
and , *3 G [1 , , then v\ Qv^, and 

(c) for every E-graph G, if G ^ (f>g^ij{u,v) for g G T' , then u G Vq and 
v ^ Vq > if G \= 4>iij{u,v) for £ G {1, ■ • . , \E'\max}, then u ^ Vq and 
vGV§. 

The following lemma states that we can always ensure condition (a). 

Lemma 4. For each MSO-transduction $\mathcal{T}$ there exists an MSO-transduction $\mathcal{T}'$
such that $r_{\mathcal{T}'(G)} = (r_G, 0)$ and $\mathrm{unfold}(\mathcal{T}(G)) = \mathrm{unfold}(\mathcal{T}'(G))$ for each rooted
graph $G$.

The most intricate part is to establish condition (b) of Definition 1. First, we
construct for a given tree $t$ a new graph $\hat{t}$ on which the MSO-transduction must
satisfy condition (b).

Let t be an iF-tree. We can assume that Vt C E* by identifying each element 
V from Vt with the labeling of the unique path leading from rt to v. We define 
the iF-graph t by 

— V-^ = {(mi, . . . ,Un) I n > 0, ui,. . . ,Un G Vt, and Ui-i % Ui for all i G 
[2,n]}. 

— The edges in Ep are those of the form {u\, . . . ,Un) — > (u-i, • ■ • ,Uno) and 
(mi, . . . ,u„,va,v) A {ui, . . . ,u„,va). 

Figure 1 shows a part of this construction for a term built from a single binary 
symbol /. The underlined vertices correspond to the vertices of the original term. 



ifhs) (/!,/,£) 




{/L/2> (/1> 

/j /j 



(£> (/,e> (/2,e) {/2,/,e) 

A ^ n ^ 
if) ir2j) 

^ y 

im (/ 2 ./ 1 ) 



Fig. 1. A part of tf 



For $u \in V_{\hat{t}}$ we denote by $\lambda(u)$ the last element in the sequence $u$, i.e., if
$u = (u_1, \ldots, u_m)$, then $\lambda(u) = u_m$. The following properties are easy to derive
from the definition of $\hat{t}$.

Lemma 5. For each $\mathcal{F}$-tree $t$ and each vertex $v \in V_{\hat{t}}$:

(i) $\hat{t}$ is an (undirected) tree.

(ii) If $\lambda(v) = \varepsilon$, then there are no edges with target $v$ in $\hat{t}$. Otherwise, $v$ is the
target of exactly two edges in $\hat{t}$ that have the same label as the only edge
with target $\lambda(v)$ in $t$.

(iii) $v$ is the source of an $a$-edge in $\hat{t}$ iff $\lambda(v)$ is the source of an $a$-edge in $t$.

(iv) $\mathrm{unfold}(t, v) = \mathrm{unfold}(\hat{t}, (v))$ for each $\mathcal{F}$-tree $t$ and vertex $v$ of $t$.

An important property of $\hat{t}$ is that two vertices $u$ and $u'$ with $\lambda(u) = \lambda(u')$ are
indistinguishable, i.e., there is an automorphism of $\hat{t}$ that maps $u$ to $u'$.
This enforces a certain behavior of deterministic MSO-transductions on $\hat{t}$ that
corresponds to the second property of the top-down normal form.

Lemma 6. Let $\mathcal{T}$ be a deterministic MSO-transduction, $t$ an $\mathcal{F}$-tree, and $u, v \in V_{\hat{t}}$.
If $\hat{t} \models \phi_{a,i,j}(u, v)$, then $\lambda(u) \sqsubseteq \lambda(v)$.

As Lemma 6 shows, property (b) of Definition 1 holds on $\hat{t}$ for MSO-transductions
preserving bisimulation equivalence. To transfer this property to $t$ itself we will
make use of the concept of tree-like structures (cf. [24,3]).

For a graph $G$ the tree-like structure $G^* = (V_G^*, \mathrm{son}, \mathrm{clone}, E_G^*)$ is the structure
over the universe $V_G^*$ with the relations son, clone, and $E_G^*$ defined by
$\mathrm{son} = \{(w, wv) \mid w \in V_G^*, v \in V_G\}$, $\mathrm{clone} = \{wvv \mid w \in V_G^*, v \in V_G\}$, and
$E_G^* = \{(wv, a, wu) \mid w \in V_G^*, (v, a, u) \in E_G\}$. The crucial point of the tree-like
structure is that its monadic second-order properties can be expressed as monadic
second-order properties of the original structure.
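
To make the three relations concrete, here is a minimal Python sketch that builds a finite fragment of a tree-like structure. It is only an illustration under stated assumptions: the universe $V_G^*$ is infinite, so the sketch truncates it at a hypothetical bound `max_len`, and the function name `tree_like` and the tuple encoding of sequences are not taken from the paper.

```python
from itertools import product


def tree_like(vertices, edges, max_len):
    """Finite fragment of G*: sequences over `vertices` of length <= max_len.

    `edges` is a set of labelled edges (v, a, u) of the graph G.
    The truncation at `max_len` is an assumption of this sketch.
    """
    universe = [w for n in range(max_len + 1)
                for w in product(vertices, repeat=n)]
    # Sequences that can still be extended without leaving the fragment.
    short = [w for w in universe if len(w) < max_len]
    # son relates w to every one-vertex extension wv.
    son = {(w, w + (v,)) for w in short for v in vertices}
    # clone holds for sequences whose last two entries coincide (w v v).
    clone = {w for w in universe if len(w) >= 2 and w[-1] == w[-2]}
    # Edges of G are copied below every prefix w: (wv, a, wu).
    star_edges = {(w + (v,), a, w + (u,)) for w in short for (v, a, u) in edges}
    return universe, son, clone, star_edges
```

For a graph with two vertices and a single edge, `tree_like(("p", "q"), {("p", "a", "q")}, 2)` yields seven sequences, and the edge is copied once below each of the three sequences of length less than two.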

Theorem 2 (cf. [3]). For each MSO-sentence $\phi$ there exists an MSO-sentence
$\phi^*$ such that $G^* \models \phi \Leftrightarrow G \models \phi^*$ for all $\mathcal{F}$-graphs $G$.

The two structures $\hat{t}$ and $t^*$ are closely related: $\hat{t}$ can be interpreted in $t^*$. This
can be used to show a modification of the above result for $\hat{t}$.

Lemma 7. For each MSO-formula $\phi(x_1, \ldots, x_m)$ there exists an MSO-formula
$\phi^*(x_1, \ldots, x_m)$ such that for all $\mathcal{F}$-trees $t$:

(i) If $\hat{t} \models \phi(u_1, \ldots, u_m)$, then $t \models \phi^*(\lambda(u_1), \ldots, \lambda(u_m))$ for all $u_1, \ldots, u_m \in V_{\hat{t}}$.

(ii) If $t \models \phi^*(u_1, \ldots, u_m)$ for $u_1, \ldots, u_m \in V_t$, then there are $\hat{u}_1, \ldots, \hat{u}_m \in V_{\hat{t}}$
with $\lambda(\hat{u}_i) = u_i$ ($1 \le i \le m$) and $\hat{t} \models \phi(\hat{u}_1, \ldots, \hat{u}_m)$.

Now we can establish property (b) of Definition 1. 

Lemma 8. For each bisimilarity preserving MSO-transduction $\mathcal{T}$ there exists an
MSO-transduction $\mathcal{T}'$ satisfying (a) and (b) of Definition 1 with $\mathrm{unfold}(\mathcal{T}(t)) =
\mathrm{unfold}(\mathcal{T}'(t))$ for all $\mathcal{F}$-trees $t$.

As a last step in the normalization we have to ensure (c) of Definition 1 . Since 
the MSO-transduction might use copies of auxiliary vertices as targets of E- 
edges and main vertices as targets of other .7^-edges we cannot simply redirect 
the latter ones to main vertices within the same copy. For this reason we have 
to introduce new copies of the original term and redirect edges with wrong type 
of source vertex or target vertex to these new copies. 

Lemma 9. For each bisimilarity preserving MSO-transduction $\mathcal{T}$ there exists
an MSO-transduction $\mathcal{T}'$ in top-down normal form with $\mathrm{unfold}(\mathcal{T}(t)) =
\mathrm{unfold}(\mathcal{T}'(t))$ for all $\mathcal{F}$-trees $t$.

Having established the top-down normal form the next goal is to simulate 
MSO-transductions in this normal form by deterministic transducers. We make 
use of the well-known equivalence between MSO logic and Rabin automata on 
infinite trees (cf. [23]). This allows us to pass from the formulas defining the 
edges in the MSO-transduction to equivalent automata accepting infinite terms
with appropriate markings coding the assignment for the free first-order variables 
in the formulas. We start by defining marked terms and automata running on 
them. Due to the lack of space we do not give detailed definitions and assume 
familiarity with the theory of automata on infinite trees (cf. [23]). 

For a set $\Theta$, a $\Theta$-marked $\mathcal{F}$-term $(t, \mu)$ consists of an $\mathcal{F}$-term $t$ and a marking
function $\mu : \Theta \to V_t$. Let $T(\mathcal{F}, \Theta)$ be the set of all $\Theta$-marked $\mathcal{F}$-terms. For a set
$L \subseteq T(\mathcal{F}, \Theta)$ of marked terms we define $L|_{\mathcal{F}} = \{t \mid \exists \mu : (t, \mu) \in L\}$.

We fix an MSO-transduction T = {S , S' ,{<pa^ij{x,y))a,ij,{pi{x,y))i,n) in 
top-down normal form that was obtained from a bisimilarity preserving MSO- 
transduction as described in the previous subsection. For all g G T' , all i G [1, n], 
and all j G define 



'4’g,i,j{x,xi,... ,x\g\) =3y \J \(j}g^i^rnix,y) A f\ \ , 

mG[l,n] \ £g{ 1,... ,|g|} / 

where j( denotes the ^th component of j, i.e., j = (ji, • . • ,jk)- For g G T' we 
define 0g = {g,l, . . . , |(;|}. For {t, p) G T{T, 0g) we write by abuse of notation 
{t,p) ^ V’s.j.j iff t h ■ ,K\9\))- Note that T being in top- 

down normal form implies that p{g) , p{l) , . . . , p{\g\) G if (t,/x) \= '4’g,i,j 
by Definition 1 (c). This enables us to adapt the usual framework of automata 
running on infinite trees (cf. [23]) to our representation of marked terms. 

Every MSO-formula can be translated into a nondeterministic Rabin tree 
automaton accepting precisely the models of the formula [21]. Hence, for all g G 
IF', all i G [l,n], and all j G [l,n]l®l there exists a nondeterministic Rabin tree 
automaton Ag^ij = {Qg,i,j,^ x ^Q'gi Ag^ij, Hg^ij) that accepts exactly 
the ©g-marked .F-terms {t, p) with (t, p) ]= We assume that the state sets 

of these automata are pairwise disjoint. We denote the union of the automata 
Ag^ij by the automaton A = (Q, ^ 2®9), Qin, A, Q), where Q, Qi„, A, 

and f2 are obtained by taking the union of the respective components of the 
automata Ag^i^j. 

The automaton A is the main ingredient in the construction of the trans- 
ducer. The idea is to keep track of the states of A that could have been reached 
on the input term and to use the rational lookahead to decide which automaton 
Ag^ij to use to construct the next edge. We illustrate the work of the transducer 
with a simple example. 

Consider the part of a marked term $(t, \mu)$ depicted in the upper left box
of Figure 2. The labels $g$, 1, and 2 on the vertices are the marks. The states
$q_g, q, q_1, q_2$ are part of an accepting run of the automaton $A_{g,i,(j_1,j_2)}$ for some
$i, j_1, j_2$. In steps (1) to (5) the work of the transducer is illustrated.

In (1) the transducer is in state $(i, C)$. The $i$ indicates in which copy introduced
by the MSO-transduction the transducer currently is. The set of states
that could have been reached by $A$ at this point is stored in $C$. We assume
that $q_g \in C$ and that, using its rational lookahead, the transducer can check that the
term with the marking depicted in the upper left box is accepted by $A_{g,i,(j_1,j_2)}$
with $q_g$ as initial state.

In step (2) the transducer applies a production rule creating the $g$-edge,
the 1-edge, and the 2-edge. Now it has to consume the part of the term until
reaching the vertex marked with 1 in the left hand side copy and likewise in the
right hand side copy for the vertex marked with 2. To that aim it goes to the
states $(j_1, C, q_g, 1)$ and $(j_2, C, q_g, 2)$ in the two corresponding copies of $t$. The
last component of the states indicates for which mark the transducer is waiting.

Fig. 2. Illustration of the transducer constructing an edge

In step (5) the transducer has reached the two corresponding vertices. Note
that in step (3) two independent rewritings of the transducer have been applied 
in parallel and similarly in step (4). In all these steps the transducer implicitly 
uses its lookahead to decide in which direction to proceed in the term. In each 
consumption step the sets storing the reachable states of A are updated. Having 
reached the desired vertices the transducer switches back to the mode where it 
checks which edge to construct next. The formal realization of this idea is rather 
technical and is omitted here. 

Lemma 10. There exists a deterministic transducer $T$ such that $\mathrm{unfold}(\mathcal{T}(t)) =
T(t)$ for each term $t$.

Now we can prove the main result of this section. 

Theorem 3. For each bisimilarity preserving MSO-transduction $\mathcal{T}$ there exists
a deterministic transducer $T$ such that $\mathrm{unfold}(\mathcal{T}(G)) = T(\mathrm{unfold}(G))$ for each
$\mathcal{F}$-graph $G$.

Proof. Let $\mathcal{T}$ be a bisimilarity preserving MSO-transduction. By Lemma 9 there
is an MSO-transduction $\mathcal{T}'$ in top-down normal form such that $\mathrm{unfold}(\mathcal{T}(t)) =
\mathrm{unfold}(\mathcal{T}'(t))$ for all $\mathcal{F}$-trees $t$. According to Lemma 10 there is a deterministic
transducer $T$ such that $\mathrm{unfold}(\mathcal{T}'(t)) = T(t)$ for each $\mathcal{F}$-tree $t$. Let $G$ be an
$\mathcal{F}$-graph and let $t_G = \mathrm{unfold}(G)$. Then we get $\mathrm{unfold}(\mathcal{T}(G)) = \mathrm{unfold}(\mathcal{T}(t_G)) =
\mathrm{unfold}(\mathcal{T}'(t_G)) = T(t_G)$. □

Abstract. An automatic structure $\mathcal{A}$ is one whose domain $A$ and atomic
relations are finite automaton (FA) recognisable. A structure isomorphic
to $\mathcal{A}$ is called automatically presentable. Suppose $R$ is an FA recognisable
relation on $A$. This paper concerns questions of the following type.
For which automatic presentations of $\mathcal{A}$ is (the image of) $R$ also FA
recognisable? To this end we say that a relation $R$ is intrinsically regular
in a structure $\mathcal{A}$ if it is FA recognisable in every automatic presentation
of the structure. For example, in every automatic structure all relations
definable in first order logic are intrinsically regular. We characterise the
intrinsically regular relations of some automatic fragments of arithmetic
in the first order logic extended with quantifiers $\exists^{\infty}$, interpreted as ‘there
exist infinitely many’, and $\exists^{(i)}$, interpreted as ‘there exists a multiple of
$i$ many’.



1 Introduction 

This paper investigates the relationship between regularity, that is FA recognisability,
and definability in automatic structures. Roots of this topic go back to
the results of Büchi and Elgot in the 1960s, who proved the equivalence between
regularity and weak monadic second order logic. A recasting of this result says
that (the coding of) a relation is regular if and only if the relation is first order
definable in the structure $(\mathbb{N}, +, |_k)$, where $+$ is addition and $n \,|_k\, m$ means
that $n$ is a power of $k$ and $n$ divides $m$. Intimately related is the work of Cobham,
Semenov, Muchnik, Bruyère and others that investigates the relationship
between regular relations of (coded) natural numbers and definability in certain
fragments of arithmetic; see [2] for a good exposition. This paper continues and
complements these lines of research by initiating the study of the relationship
between regularity and definability in the general setting of arbitrary automatic
structures.

Assume one has a structure $\mathcal{A}$ that can be described by means of finite
automata. This is formalised in Definition 2.3 that says that there is an encoding
of the elements of the structure under which the domain $A$ of the structure and its
atomic relations are all regular. Such a structure is called automatic. In this case
we say that the coded structure is an automatic presentation of $\mathcal{A}$. Automatic
presentations of $\mathcal{A}$ can be regarded as finite automata implementations of the
structure $\mathcal{A}$. For instance, if $k > 1$, then a least-significant-digit-first base $k$
encoding of the natural numbers gives rise to automatic presentations of $(\mathbb{N}, S)$,
$(\mathbb{N}, +)$ and $(\mathbb{N}, +, |_k)$. Now assume that $R \subseteq A^n$ is a relation, not
necessarily in the language of $\mathcal{A}$. For example, $R$ may be the reachability relation
if $\mathcal{A}$ is a graph, or $R$ may be the dependency relation if $\mathcal{A}$ is a group. It may well
be the case that in one automatic presentation of $\mathcal{A}$ the relation $R$ is recognised
by a finite automaton, and in another automatic presentation it is not. Thus,
automata-theoretic properties of the relation $R$ are dependent on the automata
that describe $\mathcal{A}$. Our goal is to study those relations in $\mathcal{A}$ that are regular under
all automatic presentations of $\mathcal{A}$, and to understand which structures ensure a
relation is regular in all automatic presentations and which do not. Formally, we
introduce the following definition:

Definition 1.1. See [1]. A relation $R$ is intrinsically regular in an automatic
structure $\mathcal{A}$ if for every automatic structure $\mathcal{B}$ isomorphic to $\mathcal{A}$ the image of
the relation $R$ in $\mathcal{B}$ is regular. Denote by $\mathrm{IR}(\mathcal{A})$ the set of intrinsically regular
relations in $\mathcal{A}$.

Thus the intrinsically regular relations in A are those for which regularity is 
invariant under all automatic presentations of A. A natural class of intrinsically 
regular relations is the class of relations definable in the first order logic. We 
now single out this class of relations in the following definition: 

Definition 1.2. A relation $R$ is first order (FO) definable in $\mathcal{A}$ if there exists
a first order formula $\phi(\bar{x}, \bar{c})$ with parameters $\bar{c}$ from $A$ such that $R = \{\bar{d} \mid \mathcal{A} \models
\phi(\bar{d}, \bar{c})\}$. Denote by $\mathrm{FO}(\mathcal{A})$ the set of all first order definable relations in $\mathcal{A}$.

A fundamental result of automatic structures is stated as follows. 

Fact 1.3. Let $\mathcal{A}$ be an automatic structure. There exists an algorithm that from
a FO definition $\phi$ in $\mathcal{A}$ of a relation $R$ produces an automaton recognising $R$. In
particular, $\mathrm{FO}(\mathcal{A}) \subseteq \mathrm{IR}(\mathcal{A})$.

A proof may be found in [6] or [3]; in this paper we will use this fact without
explicitly referencing it. Naturally, one asks whether the converse holds. It turns
out that although the converse holds for some structures, for instance $(\mathbb{N}, +)$ and
$(\mathbb{N}, +, |_m)$, in general it does not.

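Extend the FO predicate logic with quantifiers $\exists^{\infty}$ and $\exists^{(i)}$, where $i \in \mathbb{N}$,
whose interpretations are as follows. The formula $\exists^{\infty} x\, \phi(x)$ means there are
infinitely many $x$ such that $\phi(x)$ holds, and the formula $\exists^{(i)} x\, \phi(x)$ means that
there are exactly $n$ elements $x$ such that $\phi(x)$ holds and $n$ is a multiple of $i$.
Denote the logic by $\mathrm{FO}^{\infty,\mathrm{mod}}$. Say that a relation $R$ is $\mathrm{FO}^{\infty,\mathrm{mod}}$-definable in a
structure $\mathcal{A}$ if there is a $\mathrm{FO}^{\infty,\mathrm{mod}}$-formula $\phi(\bar{x}, \bar{a})$, where $\bar{a}$ is a finite tuple of
elements, such that $R = \{\bar{c} \mid \mathcal{A} \models \phi(\bar{c}, \bar{a})\}$. Denote by $\mathrm{FO}^{\infty,\mathrm{mod}}(\mathcal{A})$ the set of
relations that are $\mathrm{FO}^{\infty,\mathrm{mod}}$-definable in $\mathcal{A}$. Then Fact 1.3 can be extended as
follows.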

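Theorem 3.2. See [3].¹ Let $\mathcal{A}$ be an automatic structure. There exists an algorithm
that from a $\mathrm{FO}^{\infty,\mathrm{mod}}$ definition $\phi$ of a relation $R$ produces an automaton
recognising $R$. In particular, $\mathrm{FO}^{\infty,\mathrm{mod}}(\mathcal{A}) \subseteq \mathrm{IR}(\mathcal{A})$.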

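Consequently, there is a neat characterisation of the intrinsically regular relations
of $(\mathbb{N}, <)$ in terms of $\mathrm{FO}^{\infty,\mathrm{mod}}$:

Theorem 3.3. $\mathrm{IR}(\mathbb{N}, <) = \mathrm{FO}^{\infty,\mathrm{mod}}(\mathbb{N}, <) = \mathrm{FO}(\mathbb{N}, <, M^2, M^3, \ldots)$.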

In order to show that a particular relation is intrinsically regular in a given automatic
structure, one needs to provide a mechanism for extracting an automaton
recognising the relation from automatic presentations of the structure. A perfect
illustration of this is the subset-like construction in the proof of Theorem 3.2. In order
to show that a particular relation is not intrinsically regular, one needs to construct
automata that, on the one hand, present the structure and, on the other,
preclude the existence of automata recognising the given relation. The following
theorem shows that the unary relations $M^k$ are not intrinsically regular for the
structure $(\mathbb{N}, S)$.

Theorem 4.1. For every $k \ge 2$, there is an automatic presentation of $(\mathbb{N}, S)$
in which the image of the set $M^k$ is not regular.

Consequently we have the following partial result. 

Corollary 4.2. For $R \subseteq \mathbb{N}$,

$R \in \mathrm{IR}(\mathbb{N}, S)$ if and only if $R \in \mathrm{FO}^{\infty,\mathrm{mod}}(\mathbb{N}, S)$.

Theorem 4.1 and its proof may be applied to construct automatic structures with
pathological properties. The first application is concerned with the reachability
problem in automatic graphs. It is known that the reachability problem for
automatic graphs is not decidable, see [3]. The underlying reason for this is that
such automatic graphs necessarily have infinitely many components. In fact, the 
reachability problem is decidable if the given graph is automatic and has finitely 
many components. A natural question is whether or not the reachability relation 
for automatic graphs with finitely many components can be recognised by finite 
automata. This is answered in the following corollary: 

Corollary 4.3. There exists an automatic presentation of a graph with exactly 
two connected components each isomorphic to (IN, S) in which the reachability 
relation is not regular. 

¹ In a personal communication with the first author, A. Blumensath has mentioned
having obtained this result.

The second application of Theorem 4.1 is on the structure $(\mathbb{Z}, S)$, where $\mathbb{Z}$ is
the set of all integers and $S$ is the successor function. A cut is a set of the form
$\{x \in \mathbb{Z} \mid x > n\}$, where $n \in \mathbb{Z}$ is fixed. In all previously known automatic
presentations of $(\mathbb{Z}, S)$ each cut is a regular set. The corollary below states the
existence of a counterexample:

Corollary 4.4. There exists an automatic presentation of $(\mathbb{Z}, S)$ in which no
cut is regular.

Finally, we mention that one of the central topics in modern computable model
theory, first initiated by Ash and Nerode in [1], is concerned with understanding
the relationship between definability and computability; see [5, Chapter 3] for the
current state of the area. For a computable structure $\mathcal{A}$, that is one whose atomic
diagram is a computable set, a relation $R$ is intrinsically recursively enumerable if
in all computable isomorphic copies of $\mathcal{A}$ the relation $R$ is recursively enumerable.
In [1] Ash and Nerode show that under some natural conditions put on $\mathcal{A}$, the
relation $R$ is intrinsically recursively enumerable if and only if it is definable as
an effective disjunction of existential formulas. One may therefore regard the
topic of this paper as a refined version of the Ash-Nerode program in which the
class of automatic structures is considered rather than the class of all computable
structures.

Question 1.4. Characterise the intrinsically regular relations in A as those 
definable in a suitable logic of A. 

The results of this paper suggest that the logic is $\mathrm{FO}^{\infty,\mathrm{mod}}$.

The rest of the paper is organised as follows. The next section contains automata 
preliminaries including the definition of an automatic structure and a descrip- 
tion of simple properties of intrinsically regular relations. The remaining sections 
contain proofs of some of the results stated in the introduction. Due to space con- 
straints some of the proofs are replaced by sketches or completely omitted. The 
complete proofs can be found in the full version of this paper which is available 
as a technical report of the Centre for Discrete Mathematics and Theoretical 
Computer Science in Auckland. 



2 Automata Preliminaries 

A thorough introduction to automatic structures can be found in [3] and [6].
In this section, familiarity with the basics of finite automata theory is assumed,
though for completeness and to fix notation, the necessary definitions are included
here. A finite automaton $A$ over an alphabet $\Sigma$ is a tuple $(S, \iota, \Delta, F)$,
where $S$ is a finite set of states, $\iota \in S$ is the initial state, $\Delta \subseteq S \times \Sigma \times S$ is the
transition table and $F \subseteq S$ is the set of final states. A computation of $A$ on a
word $\sigma_1 \sigma_2 \ldots \sigma_n$ ($\sigma_i \in \Sigma$) is a sequence of states, say $q_0, q_1, \ldots, q_n$, such that
$q_0 = \iota$ and $(q_i, \sigma_{i+1}, q_{i+1}) \in \Delta$ for all $i \in \{0, 1, \ldots, n-1\}$. If $q_n \in F$, then the
computation is successful and we say that automaton $A$ accepts the word. The
language accepted by the automaton $A$ is the set of all words accepted by $A$. In
general, $D \subseteq \Sigma^*$ is finite automaton recognisable, or regular, if $D$ is the language
accepted by a finite automaton $A$.
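
The definition of acceptance can be sketched in a few lines of Python. This is an illustration only: the function name `accepts`, the representation of the transition table as a set of triples, and the parity automaton used as an example are assumptions of the sketch, not notation from the paper.

```python
from typing import Iterable, Set, Tuple

State = str
Symbol = str


def accepts(initial: State,
            delta: Set[Tuple[State, Symbol, State]],
            final: Set[State],
            word: Iterable[Symbol]) -> bool:
    """Return True iff some computation on `word` ends in a final state."""
    current = {initial}  # states reachable after the symbols read so far
    for a in word:
        current = {q2 for (q1, b, q2) in delta if q1 in current and b == a}
    return bool(current & final)


# Hypothetical example: words over {0, 1} containing an even number of 1s.
EVEN_ONES_DELTA = {
    ("even", "0", "even"), ("even", "1", "odd"),
    ("odd", "0", "odd"), ("odd", "1", "even"),
}
EVEN_ONES_FINAL = {"even"}
```

Tracking the set of reachable states handles a transition table that is a relation rather than a function, which matches the definition above.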

Classically finite automata recognise sets of words. The following definitions
extend recognisability to relations of arity $n$ by means of synchronous $n$-tape
automata. Informally a synchronous $n$-tape automaton can be thought of as a
one-way Turing machine with $n$ input tapes. Each tape is regarded as semi-infinite,
having written on it a word in the alphabet $\Sigma$ followed by an infinite
succession of blanks, $\diamond$ symbols. The automaton starts in the initial state, reads
simultaneously the first symbol of each tape, changes state, reads simultaneously
the second symbol of each tape, changes state, etc., until it reads a blank on each
tape. The automaton then stops and accepts the $n$-tuple of words if it is in a
final state. The set of all $n$-tuples accepted by the automaton is the relation
recognised by the automaton. Here is a formalization.

Definition 2.1. Write $\Sigma_\diamond$ for $\Sigma \cup \{\diamond\}$ where $\diamond$ is a symbol not in $\Sigma$. The
convolution of a tuple $(w_1, \ldots, w_n) \in (\Sigma^*)^n$ is the string $\otimes(w_1, \ldots, w_n)$ of length
$\max_i |w_i|$ over alphabet $(\Sigma_\diamond)^n$ defined as follows. Its $k$'th symbol is $(\sigma_1, \ldots, \sigma_n)$
where $\sigma_i$ is the $k$'th symbol of $w_i$ if $k \le |w_i|$ and $\diamond$ otherwise.

The convolution of a relation $R \subseteq (\Sigma^*)^n$ is the relation $\otimes R \subseteq ((\Sigma_\diamond)^n)^*$ formed
as the set of convolutions of all the tuples in $R$. That is, $\otimes R = \{\otimes w \mid w \in R\}$.
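
As a small illustration, the convolution of a tuple of words can be computed as follows. The sketch assumes that words are Python strings and uses `#` as a stand-in for the blank symbol; both choices, and the function name `convolution`, are assumptions made for the example.

```python
def convolution(*words: str, pad: str = "#") -> list:
    """Convolution of a tuple of words as a list of symbol tuples.

    The k-th tuple holds the k-th symbol of each word, or `pad`
    once a word has been exhausted.
    """
    length = max((len(w) for w in words), default=0)
    return [tuple(w[k] if k < len(w) else pad for w in words)
            for k in range(length)]
```

For instance, `convolution("ab", "c")` pairs `a` with `c` and then `b` with the blank, so a synchronous automaton reading it sees both tapes in lockstep.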

Definition 2.2. An $n$-tape automaton on $\Sigma$ is a finite automaton over the
alphabet $(\Sigma_\diamond)^n$. An $n$-ary relation $R \subseteq (\Sigma^*)^n$ is finite automaton recognisable or
regular if its convolution is recognisable by an $n$-tape automaton.

We now relate $n$-tape automata to structures. A structure $\mathcal{A}$ consists of a set
$A$ called the domain and some relations and operations on $A$. We may assume
that $\mathcal{A}$ only contains relational predicates as the operations can be replaced with
their graphs. We write $\mathcal{A} = (A, R_1^A, \ldots, R_i^A, \ldots)$ where $R_i^A$ is an $n_i$-ary relation
on $A$. The relations $R_i$ are sometimes called basic or atomic relations. We assume
that the function $i \mapsto n_i$ is always a computable one.

Definition 2.3. A structure $\mathcal{A}$ is automatic over $\Sigma$ if its domain $A \subseteq \Sigma^*$ is
finite automaton recognisable, and there is an algorithm that for each $i$ produces
a finite automaton recognising the relation $R_i^A \subseteq A^{n_i}$. An isomorphism from
a structure $\mathcal{B}$ to an automatic structure $\mathcal{A}$ is an automatic presentation of $\mathcal{B}$, in
which case $\mathcal{B}$ is called automatically presentable (over $\Sigma$). A structure is called
automatic if it is automatic over some alphabet.

Consider the word structure $(\{0,1\}^*, L, R, E, \preceq)$, where for all strings $x, y \in
\{0,1\}^*$ we have $L(x) = x0$, $R(x) = x1$, $E(x, y)$ iff $|x| = |y|$, and $\preceq$ is the
lexicographical order. It is automatic over $\Sigma$. The configuration graphs of Turing
machines are examples of automatic structures. Write $\mathbb{N}$ for the set of natural
numbers including 0. Examples of automatically presentable structures are
$(\mathbb{N}, +)$, $(\mathbb{N}, <)$, $(\mathbb{N}, S)$, the group $(\mathbb{Z}, +)$, the order on the rationals $(\mathbb{Q}, <)$, and
the Boolean algebra of finite or co-finite subsets of $\mathbb{N}$.

Thus an automatic structure is one that is explicitly given by finite au- 
tomata that recognise the domain and the basic relations of the structure. An 
automatically presentable structure is one that is isomorphic to some automatic 
structure. Informally, automatically presentable structures are those that have
finite automata implementations. The same structure may have different (indeed 
infinitely many) automatic presentations. One of our goals is to understand the 
relationships between different automatic presentations of a given structure and 
understand how the automata-theoretic properties of relations of this structure 
change when one varies its automatic presentation. We illustrate the introduced 
concepts with two examples. The first concerns the standard model of Presburger 
Arithmetic $(\mathbb{N}, +)$.

Example 2.4. For each $m > 1$ consider the presentation $A_m$ of $\mathbb{N}$ over the
alphabet $\Sigma_m = \{0, \ldots, m-1\}$. Here the natural number $n \in \mathbb{N}$ is represented
in $A_m$ as its shortest least-significant-digit-first base-$m$ representation. The
structure $(A_m, +_m)$ is automatic and is isomorphic to $(\mathbb{N}, +)$. Hence, these are
automatic presentations of $(\mathbb{N}, +)$. Take any $n$-ary relation $R$ in $\mathbb{N}$. Assume
that $R$ is intrinsically regular. Then the image of $R$ in $(A_m, +_m)$ is regular.
The well-known Cobham-Semenov Theorem, see [2], states that if the images of $R$
in both $(A_i, +_i)$ and $(A_j, +_j)$ are regular for multiplicatively independent $i$ and $j$,
then $R$ is definable in $(\mathbb{N}, +)$. Thus $\mathrm{IR}(\mathbb{N}, +) = \mathrm{FO}(\mathbb{N}, +)$.
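
To see why addition is regular in these presentations, one can simulate the automaton for $+_m$: it reads the three digit tracks in lockstep and only remembers the current carry. The Python sketch below is illustrative; the function names, the list-of-digits encoding, and the choice to encode 0 as the empty word are assumptions of the sketch, and blanks are treated as the digit 0.

```python
def encode(n: int, m: int) -> list:
    """Shortest least-significant-digit-first base-m digits of n.

    Assumption of this sketch: 0 is encoded as the empty word.
    """
    digits = []
    while n > 0:
        digits.append(n % m)
        n //= m
    return digits


def is_sum(x: list, y: list, z: list, m: int) -> bool:
    """Simulate a carry automaton on the convolution of (x, y, z).

    The only state is the carry (0 or 1); blanks are read as digit 0.
    """
    length = max(len(x), len(y), len(z))
    pad = lambda d: list(d) + [0] * (length - len(d))
    carry = 0
    for a, b, c in zip(pad(x), pad(y), pad(z)):
        total = a + b + carry
        if total % m != c:
            return False      # the automaton moves to a rejecting sink
        carry = total // m
    return carry == 0         # accept only if no carry is left over
```

Since the carry is bounded by 1, the automaton needs only finitely many states regardless of the length of the inputs.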

The second example concerns an extension of $(\mathbb{N}, +)$.

Example 2.5. Let $|_m$, for $m \ge 2$, be the binary relation where $x \,|_m\, y$ if and
only if $x$ is a power of $m$ and $x$ divides $y$. Then the structure $(\mathbb{N}, +, |_m)$ has an
automatic presentation $\mathcal{A}_m = (A_m, +_m, |_m)$. So if $R \subseteq \mathbb{N}^n$ is intrinsically
regular for $(\mathbb{N}, +, |_m)$, then its image is regular in $\mathcal{A}_m$. But a central
result of automatic structures is that first order definability in the structure
$\mathcal{A}_m$ is equivalent to FA recognisability, see for instance [6]. Hence this image is first
order definable in $\mathcal{A}_m$, and so $R$ is first order definable in $(\mathbb{N}, +, |_m)$. Thus
$\mathrm{IR}(\mathbb{N}, +, |_m) = \mathrm{FO}(\mathbb{N}, +, |_m)$.

3 Intrinsically Regular Relations in $(\mathbb{N}, <)$

The linearly ordered set $(\mathbb{N}, <)$ has automatic presentations. For example,
automatic presentations of $(\mathbb{N}, +)$ are also automatic presentations of $(\mathbb{N}, <)$. In
this section we study intrinsically regular relations of this structure. Somewhat
surprisingly we exhibit intrinsically regular relations of the structure $(\mathbb{N}, <)$ that
are not first order definable. We remind the reader that the only first order definable unary
relations of $(\mathbb{N}, <)$ are finite or co-finite [4, Theorem 32A].

Let $M^i \subseteq \mathbb{N}$ be the set of all positions in $\mathbb{N}$ that are multiples of $i$. Then
these sets are not definable in $(\mathbb{N}, <)$, but are intrinsically regular:

Theorem 3.1. For every $i$ the unary predicate $M^i$ is intrinsically regular in the
structure $(\mathbb{N}, <)$.

Proof. Let $(D, <_D)$ be an automatic presentation of $(\mathbb{N}, <)$ over $\Sigma$. We prove
the case when $i = 2$; the case when $i \ge 3$ can be proved in a similar way. Let
$E \subseteq D$ be the set of words corresponding to the set of all even natural numbers.
Then $x \in E$ iff $\{y \in D \mid y <_D x\}$ has odd cardinality. Our goal is to define an
automaton over $\Sigma$ that accepts all such strings $x$. A rough idea is that the new
automaton we want to build calculates the parity of the number of paths in $A$
with second component fixed at $x$ and accepts $x$ when the parity of the number
of successful paths is odd.

446 B. Khoussainov, S. Rubin, and F. Stephan

Let $A = (Q_A, \iota_A, \Delta_A, F_A)$ be the automaton over $\Sigma$ recognising $<_D$. We
assume that the automaton $A$ is deterministic. Also, note that since the set
$\{y \in D \mid y <_D x\}$ is finite for any string $x \in D$, we may assume the following.
For each state $s \in Q_A$ there are finitely many strings of the form $(v, \diamond^m)$ that
transform the state $s$ into a final state.

Fix a string $x \in D$ and a prefix $w$ of $x$. For a state $s \in Q_A$ consider all strings
$v$ such that $|v| = |w|$ and the automaton $A$ transforms the string $(v, w)$ to state
$s$ from the initial state $\iota_A$. Call these strings $(w, s)$-strings.

The idea in constructing the desired automaton is this. We use the automaton
$A$. Processing the initial prefix $w$ of $x$, for each state $s$, we count the parity of
the number of $(w, s)$-strings. We keep a record of only those states $s$ such that
the number of $(w, s)$-strings is odd. By the time we finish processing the string
$x$ we have a record of all states $s_1, \ldots, s_k$ such that for each state $s_i$ there are an
odd number of $(x, s_i)$-strings. For each $s_i$ we count the number $n_i$ of strings of
the type $(v, \diamond^m)$, with $m = |v|$, such that the string transforms $s_i$ into a final
state of $A$. Then whether or not $x \in E$ can be decided based upon the parity of
the numbers $n_1, \ldots, n_k$. Here is a formal description of the desired automaton
$B = (Q_B, \iota_B, \Delta_B, F_B)$ over $\Sigma$:

1. The set $Q_B$ of states of $B$ consists of all subsets of $Q_A$.

2. The initial state $\iota_B$ of $B$ is $\{\iota_A\}$.

3. $\Delta_B(X, \sigma) = Y$, where $Y$ consists of all states $s \in Q_A$ such that there are an
odd number of pairs $(s', \sigma')$ for which $s = \Delta_A(s', (\sigma', \sigma))$, $s' \in X$ and $\sigma' \in \Sigma_\diamond$.
(Note that $Y$ could be empty.)

4. The set of final states $F_B$ is defined as follows. Assume $X = \{s_1, \ldots, s_k\}$ is a
subset of $Q_A$. For each $s_i$, count the number $n_i$ of all strings of the type $(v, \diamond^m)$,
with $v \in \Sigma^*$, $m = |v|$, such that the string $(v, \diamond^m)$ transforms $s_i$ into a final
state of $A$. Then $X \in F_B$ if and only if $X \neq \emptyset$ and the number $n_1 + \ldots + n_k$
is odd.

Let $x = \sigma_0 \dots \sigma_n$ be an input string for $B$. Let $m$ be the cardinality of the set
$\{y \mid y <_D x\}$. Let $X_0 = \{\iota_A\}, X_1, \dots, X_{n+1}$ be a run of $B$ on $x$. The automaton
has the following property (*) that can be proved by induction on $i \geq 1$:

(*) A state $s$ is in $X_i$ if and only if the number of $(\sigma_0 \dots \sigma_{i-1}, s)$-strings is odd.

Let $X_{n+1} = \{s_1, \dots, s_k\}$. For each $s_i \in X_{n+1}$ the number of $(x, s_i)$-strings is
odd. Consider the number $n_i$ of all strings of the type $(v, \diamond^m)$, with $m = |v|$, such
that the string transforms $s_i$ into a final state of $A$. From the definition of
final states for $B$, and the inductive assumption on $X_{n+1}$, we see that the cardinality
of the set $\{y \mid y <_D x\}$ is odd if and only if $X_{n+1}$ is non-empty and the number
$n_1 + n_2 + \dots + n_k$ is odd. The theorem is proved. □
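The parity-tracking transition in item 3 can be sketched in Python. This is only an illustration under stated assumptions: the three-state automaton `delta_A` (which compares an equal-length string $v$ with $w$ lexicographically) is a hypothetical stand-in for $A$, and the padding symbol is ignored, so only property (*) is checked, not the final-state condition.

```python
from collections import Counter
from itertools import product

ALPHABET = "01"  # hypothetical input alphabet, without the padding symbol


def delta_A(state, pair):
    """Toy deterministic automaton on pairs (sigma', sigma).

    It tracks whether the first track is equal to, less than, or greater
    than the second track in lexicographic order (an assumed example).
    """
    a, b = pair
    if state != "eq":
        return state  # the comparison is already decided
    if a == b:
        return "eq"
    return "lt" if a < b else "gt"


def parity_step(X, sigma):
    """Item 3: keep the states reached by an odd number of pairs (s', sigma')."""
    counts = Counter(delta_A(s, (a, sigma)) for s in X for a in ALPHABET)
    return frozenset(s for s, c in counts.items() if c % 2 == 1)


def parity_run(w):
    """Run the subset automaton B on w, starting from {initial state of A}."""
    X = frozenset({"eq"})
    for sigma in w:
        X = parity_step(X, sigma)
    return X


def brute_force(w):
    """Property (*) checked directly: count the (w, s)-strings for each s."""
    counts = Counter()
    for v in product(ALPHABET, repeat=len(w)):
        s = "eq"
        for a, b in zip(v, w):
            s = delta_A(s, (a, b))
        counts[s] += 1
    return frozenset(s for s, c in counts.items() if c % 2 == 1)
```

For the toy automaton, `parity_run("101")` contains `"lt"` because 5 strings of length 3 are lexicographically smaller than `101`, an odd number.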

As mentioned in the introduction, this proof can be generalised as follows. 

Theorem 3.2. Let $A$ be an automatic structure. There exists an algorithm that
from any $FO^{\infty,\mathrm{mod}}$ definition $\phi$ of a relation $R$ produces an automaton
recognising $R$. In particular, $FO^{\infty,\mathrm{mod}}(A) \subseteq IR(A)$.

Proof (sketch). Constructing an automaton that recognises a relation definable
by an $\exists^{(*)} y\, \phi(x, y)$ formula is done in a style similar to the proof of Theorem 3.1.
Now note that $\exists^{\infty} y\, \phi(x, y)$ is equivalent to $\forall z \exists y (y \geq z \wedge \phi(x, y))$, where $\geq$ is
the length-lexicographic ordering on the domain of $A$. □

Theorem 3.3. $IR(\mathbb{N}, <) = FO^{\infty,\mathrm{mod}}(\mathbb{N}, <) = FO(\mathbb{N}, <, M^2, M^3, \dots)$.

Proof (sketch). By Theorem 3.2, $FO^{\infty,\mathrm{mod}}(\mathbb{N}, <) \subseteq IR(\mathbb{N}, <)$. So suppose
that a relation $R$ is intrinsically regular for $(\mathbb{N}, <)$. Then $R$ is regular in every
automatic presentation of $(\mathbb{N}, <)$. Consider the structure $(1^*, <)$ where $<$ stands
for the ordering induced by the one on the length: $1^n < 1^m$ whenever $n < m$. This
structure is a (unary) automatic presentation of $(\mathbb{N}, <)$ and hence the image of
$R$ in this presentation is regular. It is shown in [3] that the regular relations over
the unary alphabet coincide with those that are first order definable in the structure
$(\mathbb{N}, <, M^2, M^3, \dots)$. This can be done, for example, via an analysis of finite
automata recognising relations over the unary alphabet. Finally, suppose $R$ is
first order definable in $(\mathbb{N}, <, M^2, M^3, \dots)$. Since the $M^k$ are $FO^{\infty,\mathrm{mod}}$ definable
in $(\mathbb{N}, <)$, then so is $R$. □

We end the section with a simple application of Theorem 3.2 which gives a
generalization of Theorem 3.1. A tree $(T, \leq)$ is a partially ordered set with
a least element (the root) and for which every set of the form $\{y \in T \mid y < x\}$
is a finite linear order. The level $n$ of $T$ is the set of all $x \in T$ such that the
cardinality of $\{y \in T \mid y < x\}$ is $n$.

Corollary 3.4. Let $(T, \leq)$ be an automatic tree. Given $n \in \mathbb{N}$, the set $\{x \in T \mid$
$x$ is on level $n \cdot m$ for some $m \in \mathbb{N}\}$ is a regular subset of $T$.

4 Intrinsic Regularity in $(\mathbb{N}, S)$

Consider the structure $(\mathbb{N}, S)$, where $S$ is the successor function. Our goal is
to show that in this structure, the intrinsically regular unary relations are exactly those
that are either finite or co-finite. We are also interested in providing automatic
presentations of $(\mathbb{N}, S)$ in which some familiar relations are regular and some
not. Recall that the finite or co-finite subsets are the only unary relations of
this structure that are first order definable, a property that easily follows from
elimination of quantifiers [4, Theorem 31G]. The next theorem shows that the
set $M^k$ is not an intrinsically regular relation in $(\mathbb{N}, S)$, and so by Theorem 3.1
neither is $<$.

Theorem 4.1. For every $k \geq 2$, there is an automatic presentation of $(\mathbb{N}, S)$
in which the image of the set $M^k$ is not regular.

Proof. Fix $k \geq 2$ and let $\Sigma = \{0, 1, \dots, k - 1\}$. We construct an automatic
structure $(\Sigma^*, f)$ isomorphic to $(\mathbb{N}, S)$. To do this, for any given string $x \in \Sigma^*$,
we introduce the following auxiliary notations: $ep(x)$ is the string represented
by bits of $x$ at even positions; $op(x)$ is the string represented by bits of $x$ at odd
positions; $n$ and $m$ are the lengths of strings $ep(x)$ and $op(x)$, respectively. We
may also treat $ep(x)$ and $op(x)$ as natural numbers written in least-significant-digit-first
base $k$, and in particular perform addition on them. For example, if
$x = 0111001$ then $ep(x) = 0101$, $op(x) = 110$, $n = 4$ and $m = 3$; note that
$m \leq n \leq m + 1$ and $|x| = m + n$. We may regard the string $x$ as the ordered pair
of strings, written $(ep(x), op(x))$, and think of $op(x)$ as a parameter. Call strings
$x$ for which $ep(x) = k^{n-1}$ midpoints and strings for which $ep(x) = 0$ modulo
$k^n$ startpoints. Now we describe rules defining the function $f$. In brackets [[ like
this ]] we explain the meaning of each rule if needed. We note in advance that
all arithmetic is performed modulo $k^n$. Define an auxiliary function $next(x) =
ep(x) + k \cdot op(x) + k - 1$ modulo $k^n$.
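The auxiliary notation can be checked on the worked example with a short Python sketch. The function names `ep`, `op` and `next_value` mirror the text; `value` is a helper introduced here for illustration, and positions are assumed to be counted from 0, which is what the example $x = 0111001$ implies.

```python
def ep(x: str) -> str:
    """Digits of x at even positions (counting from 0)."""
    return x[0::2]


def op(x: str) -> str:
    """Digits of x at odd positions."""
    return x[1::2]


def value(digits: str, k: int) -> int:
    """Read a digit string as a base-k number, least significant digit first."""
    return sum(int(d) * k**i for i, d in enumerate(digits))


def next_value(x: str, k: int) -> int:
    """next(x) = ep(x) + k*op(x) + k - 1 modulo k^n, with n = |ep(x)|."""
    n = len(ep(x))
    return (value(ep(x), k) + k * value(op(x), k) + k - 1) % k**n
```

For $x = 0111001$ and $k = 2$ this gives `ep(x) == "0101"` (value 10), `op(x) == "110"` (value 3) and `next_value(x, 2) == 1`, since $10 + 6 + 1 = 17 \equiv 1$ modulo $2^4$.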

1. If $n \leq 2$ then $f(x)$ is the successor of $x$ with respect to length-lexicographic
ordering.

2. If $(next(x), op(x))$ is neither a midpoint nor a startpoint then $f(x) = y$, where
$ep(y) = next(x)$ and $op(y) = op(x)$. [[This is the generic case according to
which the successor of the string $x$, regarded as the pair $(ep(x), op(x))$, is
$(next(x), op(x))$.]]

3. If $(next(x), op(x))$ is a midpoint then $f(x) = y$, where $|y| = |x|$, $ep(y) =
ep(x) + next(next(x))$ modulo $k^n$ and $op(y) = op(x)$. [[This case says that if
adding $next(x)$ to $ep(x)$ produces a midpoint then the midpoint should be
skipped. Note that $ep(y) = ep(x) + 2\,next(x)$.]]

4. If $(next(x), op(x))$ is a startpoint then $f(x) = y$, where $|y| = |x|$, $ep(y) = k^{n-1}$
and $op(y) = op(x)$. [[The successor of the endpoint is the midpoint.]]

5. If $(ep(x), op(x))$ is a midpoint and $op(x) < k^m - 1$ then $f(x) = y$, where
$|y| = |x|$, $ep(y) = 0$ and $op(y) = op(x) + 1$ modulo $k^n$. [[This is the case when
the parameter $op(x)$ is incremented by 1, and the string $ep(x)$ is initialized
to the string consisting of $n$ zeros.]]

6. If $(ep(x), op(x))$ is a midpoint and $op(x) = k^m - 1$ then $f(x) = 0^{n+m+1}$.
[[This is the only case when the length of string $x$ increases by one.]]

Now we explain how $f$ acts. Fix $b \in \mathbb{N}$ congruent to $k - 1$ modulo $k$. For every
$a \in \mathbb{N}$ there is a unique number $c \in \{0, 1, \dots, k^n - 1\}$ such that $a = b \cdot c$ modulo
$k^n$. In other words, every element $c \in \{0, 1, \dots, k^n - 1\}$ appears exactly once
in the sequence $0, b, 2b, 3b, \dots, (k^n - 1)b$, where elements are taken modulo $k^n$.
Moreover, k'^~^b equals modulo fc". Hence, k'^~^b appears in the middle 

of this sequence. Let us assume that $x$ is such that $ep(x) = 0$ and let $b =
k \cdot op(x) + k - 1$. Then by rules 2, 3 and 4, the function $f$ consecutively applied
$k^n - 1$ times to $(0, op(x))$ produces the following sequence:

$(0, op(x)), (b, op(x)), \dots, (k^{n-1} - b, op(x)), (k^{n-1} + b, op(x)),$

$\dots, (k^n - b, op(x)), (k^{n-1}, op(x)).$

Note that the midpoint $(k^{n-1}, op(x))$ has been removed from the middle of the
sequence $(0, op(x)), (b, op(x)), \dots, (k^n - b, op(x))$, and placed at the end. Finally
rules 5 and 6 imply that $f$ applied to the last string $v$ in the sequence produces
the string $(0, op(x) + 1)$ if $op(x) \neq k^m - 1$; otherwise $f(v) = 0^{n+m+1}$. This
completes the description of $f$.
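The counting fact used above, that the multiples of $b$ run through every residue modulo $k^n$ exactly once when $b$ is congruent to $k - 1$ modulo $k$, can be checked numerically. The sketch below is only a sanity check on small parameters, not part of the proof; the function names are introduced here for illustration.

```python
def multiples_mod(b: int, k: int, n: int) -> list[int]:
    """The sequence 0, b, 2b, ..., (k^n - 1) b, reduced modulo k^n."""
    modulus = k**n
    return [(c * b) % modulus for c in range(modulus)]


def is_permutation(b: int, k: int, n: int) -> bool:
    """True if every residue modulo k^n appears exactly once in the sequence."""
    return sorted(multiples_mod(b, k, n)) == list(range(k**n))
```

For $k = 2$, $n = 3$ and $b = 3$ the sequence is `[0, 3, 6, 1, 4, 7, 2, 5]`, and the value $4 = 2^{n-1}$ sits at index 4, the middle of the list. An even $b$ such as `is_permutation(2, 2, 3)` fails, which shows why the congruence condition on $b$ is needed.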

The function $f$ is FA recognisable because all the rules used in the definition
of $f$ can be tested by finite automata. It can be checked that $(\Sigma^*, f)$ is isomorphic
to $(\mathbb{N}, S)$, say via a mapping $\pi : \Sigma^* \to \mathbb{N}$.

Our goal is to show that the image of the set $M^k = \{x \mid x$ is a multiple
of $k\}$ is not regular in the described automatic presentation of $(\mathbb{N}, S)$. For this
we need to have a finer analysis of the isomorphism $\pi$ from $(\Sigma^*, f)$ to $(\mathbb{N}, S)$.
Denote by $x'$ the string $(0, op(x))$. One can inductively check the following for
the case that $n \geq 3$.

1. The number $\pi(x')$ is congruent to 0 modulo $k$ for all non-empty strings $x$.

2. There is a unique $u \leq k^n - 1$ such that $ep(x) = u \cdot (k \cdot op(x) + k - 1)$ modulo
$k^n$. Moreover:

a) If $u < k^{n-1}$ then $\pi(x) = \pi(x') + u$.

b) If $u > k^{n-1}$ then $\pi(x) = \pi(x') + u - 1$.

c) If $u = k^{n-1}$ then $\pi(x) = \pi(x') + k^n - 1$.

3. If $ep(y) = 0$ and $op(y) = op(x') + 1 \leq k^m - 1$ then $\pi(y) = \pi(x') + k^n$.

Thus, from the above it is easy to see that x is in the image of iff either 

u < and u is congruent to 0 modulo k or u > k^~^ and u is congruent to 
1 modulo k. In order to show that the image of is not regular, consider all 
the strings x such that n is odd, ep{x) = 1” (its numerical value is — 1), 
op{x) = (its numerical value is — 1, so that kop{x)+k—l = A:’'+^ — 1), 

and n > r -|- 4. Then under these premises for every r g IN the minimal n g IN 
for which x g Tr(M^) is when n = 2r -I- 5: 

Indeed, (A:"“^ -I- -I- 1) • (A:’’’*'^ — 1) = A;^'’+‘* — A:"“^ — 1 modulo A;". So 

under the assumption that n = 2r + 5, this is equal to —1 = ep{x) modulo A:". 
Hence u = A:"“^ -I- -I- 1 > A:"“^ and so by item 2b above conclude that 

7t(x) = 7t(x') -I- A;"“^ -I- k^~^^ and so x g 

For the converse, (A:’’’*'^ -I- 1) • (A:’’+^ — 1) = — 1 modulo A:”. Hence under 

the assumption that n < 2r -|- 5, this is equal to — 1 = ep{x) modulo A:”. Now 
if further r -|- 3 < n — 1, then u = A;’’+^ -I- 1 < A:"“^, and so by item 2a above 
conclude that 7 t(x) = Tr(x') -I- A:’'+^ -I- 1 and so x ^ 7t(M^). 

Now we can check that $\pi(M^k)$ is not regular. Note that in the presence of
$n = 2r + 5$ the assumption that $n \geq r + 4$ is redundant since $n < r + 4$ implies
that $r < -1$, which contradicts $r \in \mathbb{N}$. So consider the non regular set

Y = {x G S* I ep{x) = 1", op{x) = 0'"“’'!’’, n = 2r -I- 5}. 

It can be defined from as the set of all x g 27* such that ep{x) = 1”, for 

some odd n, op{x) = 0'"“’'!'’ for some m, r g IN, n > r -I- 4, x g and if 

r -I- 4 < s < n then (D,op(x)) ^ 7t(M^). But since Y is not regular, neither is 
as required. □ 

Corollary 4.2. A unary relation $R \subseteq \mathbb{N}$ is intrinsically regular in $(\mathbb{N}, S)$ if and
only if it is in $FO^{\infty,\mathrm{mod}}(\mathbb{N}, S)$.

Proof. The reverse direction is immediate. For the forward direction it is sufficient
to prove that if $R \subseteq \mathbb{N}$ is intrinsically regular in $(\mathbb{N}, S)$ then it is finite
or co-finite; in this case it is in $FO(\mathbb{N}, S)$ and so certainly in $FO^{\infty,\mathrm{mod}}(\mathbb{N}, S)$.
It can be proved that if $R$ is an eventually periodic set, and if it is infinite and
co-infinite, then there is some period $p$ of $R$ such that $M^p$ is first order definable
in $(\mathbb{N}, S, R)$. Assuming this, proceed as follows. Let $R \subseteq \mathbb{N}$ be intrinsically regular
in $(\mathbb{N}, S)$. Since $(1^*, \otimes\{(1^n, 1^{n+1}) \mid n \in \mathbb{N}\})$ is an automatic presentation of
$(\mathbb{N}, S)$, $R$ must be eventually periodic. If $R$ is finite or co-finite we are done.
Otherwise $R$ is regular in every presentation of $(\mathbb{N}, S)$ and, using the fact that there
exists a period $p$ of $R$ such that $M^p$ is first order definable in $(\mathbb{N}, S, R)$, we get
that $M^p$ is also intrinsically regular in $(\mathbb{N}, S)$, contradicting the previous theorem. □

The first application of the results concerns the reachability relation in automatic
graphs. The reachability problem for automatic graphs is undecidable, see [3].
The reason for this is that such automatic graphs necessarily have infinitely
many components. In fact for automatic graphs with finitely many components
the reachability problem is decidable. A natural question is whether or not the
reachability relation for automatic graphs with finitely many components can be
recognised by finite automata. To answer this question, consider the following
graph $G = (\{0, 1\}^*, Edge)$, where $Edge(x, y)$ if and only if $f^2(x) = y$ and $f$ is the
function defined in the proof of Theorem 4.1 for $k = 2$. The graph $G$ is automatic
with exactly two infinite components each being isomorphic to $(\mathbb{N}, S)$. One of
the components coincides with the image of $M^2$ and so neither component is regular. Hence,
we have the following:

Corollary 4.3. There exists an automatic presentation of a graph with exactly 
two connected components each isomorphic to (IN, S) for which the reachability 
relation is not regular. 

A final application of this theorem is on the structure $(\mathbb{Z}, S)$. A cut is a set of
the form $\{x \in \mathbb{Z} \mid x > n\}$, where $n \in \mathbb{Z}$ is fixed.

Corollary 4.4. There is an automatic presentation of $(\mathbb{Z}, S)$ in which no cut
is regular.

Proof (sketch). It is sufficient to find a presentation of $(\mathbb{Z}, S, 0)$ in which
$\{x \in \mathbb{Z} \mid x > 0\}$ is not regular since every other cut is first order definable from
this one. We modify the presentation in the proof of Theorem 4.1 for $k = 2$, by
considering the structure $(\{0, 1\}^*, g)$, where $g$ is defined using the same notation
as before. All arithmetic below is performed modulo $2^n$.

1. If $n \leq 2$ then $g(x)$ is the length-lexicographic successor of $x$.

2. If $(ep(x) + 2op(x) + 1, op(x))$ is neither a midpoint nor a startpoint then
$g(x) = y$ with $|x| = |y|$ and $ep(y) = ep(x) + 2op(x) + 1$ and $op(y) = op(x)$.

3. If $(ep(x) + 2op(x) + 1, op(x))$ is a midpoint, then

a) if $op(x) < 2^m - 1$ then $g(x) = y$ with $|x| = |y|$ and $ep(y) = 0$ and
$op(y) = op(x) + 1$.

b) if $op(x) = 2^m - 1$ then $g(x) = y$ with $|y| = |x| + 1$ and $ep(y) = 0$ and
$op(y) = 0$.

4. If $(ep(x) + 2op(x) + 1, op(x))$ is a startpoint, then

a) if $op(x) < 2^m - 1$ then $g(x) = y$ with $|x| = |y|$ and $ep(y) = 2^{n-1}$ and
$op(y) = op(x) + 1$.

b) if $op(x) = 2^m - 1$ then

i. if $n = 3$ and $m = 2$ then $g(x) = \epsilon$. Otherwise,

ii. if $n = m + 1$ then $g(x) = y$ with $|ep(y)| = n - 1$, $|op(y)| = m$ and
$ep(y) = 2^{n-1}$ and $op(y) = 0$.

iii. if $n = m$ then $g(x) = y$ with $|ep(y)| = n$ and $|op(y)| = m - 1$ and
$ep(y) = 2^{n-1}$ and $op(y) = 0$.

Thus, ({0, 1}*, (/, e) is an automatic presentation of in which the cut 

above 0 is exactly those x such that m < 2" — 1. But if this set were regular then 
so is the image of in ({0, 1}*, /). □ 

Finally we mention that there is an automatic presentation of $(\mathbb{N}, S)$ in which $<$
is not regular but all the unary relations $M^2, M^3, \dots$ are regular. This shows that
regularity of each of the sets $M^k$ and the successor relation $S$ do not generally
imply that the relation $<$ is regular. Together with the previous result this
theorem says that regularity of $<$ is independent of whether or not the sets $M^k$ are
regular. The proof is available in the full version of this paper.

Theorem 4.5. The structure $(\mathbb{N}, S, M^2, M^3, \dots)$ has an automatic presentation
in which the relation $<$ is not regular. □

Abstract. An Active Context-Free Game is a game with two players
(Romeo and Juliet) on strings over a finite alphabet. In each move,
Juliet selects a position of the current word and Romeo rewrites the corresponding
letter according to a rule of a context-free grammar. Juliet
wins if a string of the regular target language is reached. We consider
the complexity of deciding winning strategies for Juliet depending on
properties of the grammar, of the target language, and on restrictions on
the strategy.

1 Introduction 

This work was motivated by implementation issues that arose while developing 
active XML (AXML) at INRIA. Active XML extends the framework of XML 
for describing semi-structured data by a dynamic component, which makes it possible
to cope with, e.g., web services and peer-to-peer architectures. For an extensive overview
of AXML we refer to [2,3,11]. 

We briefly describe here the background needed for understanding the mo- 
tivation of this work. Roughly speaking, an AXML document consists of some 
explicitly defined data, together with some parts that are defined only inten- 
sionally, by means of embedded calls to web services [3,9,7]. An example of 
an AXML document is given in Figure 1. An important feature is that the
call of a web service may return data containing new embedded calls to further
web services (see Figure 1). Each web service is specified using an active
extension of WSDL [17], which defines its input and output type by means
of AXML-schemes which in turn are an immediate extension of XML-schemes
with additional tags for service calls. For instance, the specification of the service
www.meteo.fr can be string → string while the specification of www.aden.fr
can be ∅ → Operas* Movies* Outdoor*, where each of Operas, Movies, Outdoor
is either string* or a pointer to a web service.

Whenever a user or another application requests some data, the system must
decide which data has to be materialized, in order to satisfy the request specification.
An important issue is then which services are called and in which order.
Assume for instance for our example in Figure 1 that there is a fee for each
service call. If the request requires minimizing the overall costs, the system
should first call www.aden.fr in order to get the list of events and only call the
weather forecast if there is some available outdoor event. The requests we are
considering in this paper ask for all available data of a given type as specified
by an AXML-schema.

[Figure: an AXML document (City with children Name: Paris, Meteo: @www.meteo.fr(Paris), Events: @www.aden.fr) and the same document after service calls (Name: Paris, Meteo: 26°/rain, Events: Operas @www.opera.fr, Outdoor Marathon).]

Fig. 1.

The system has access to local data, service specifications and a request 
specification. This can be modeled as follows (see [12,1,15]): (i) the local data 
is an AXML document corresponding to a labeled, unranked tree, (ii) the in- 
put/output type of a service specification is specified by a regular tree language, 
and, (iii) the requested data is also modeled by a regular tree language. 

As this problem turns out to be computationally difficult, we consider a 
simpler version. Actually, even this simpler variant is undecidable without any 
further restrictions. First we assume that services do not have any input. Note 
that services with a fixed number of different inputs can be modeled by consid- 
ering several different services, one per input option. Secondly, we assume that 
the output type consists of finitely many options, that is the regular language is 
in fact a disjunction of finitely many cases. Finally, we deal with strings instead 
of trees. This simplifies the combinatorics and allows a better understanding of 
the problem. Thus, the problem we consider here is stated as follows: given (i) 
a string, (ii) a set of service specifications of the form $A \to u_1 \mid \cdots \mid u_n$, where
$A$ is a letter and the $u_i$ are strings, and (iii) a regular string language, can we
decide which services to call and in which order, such that the string eventually 
obtained belongs to the regular language representing the target? We formalize 
this problem in terms of games. We discuss extensions of our framework to trees 
and full regular languages in the last section of the paper. 

An Active Context-Free Game (CF-game) is played by two players (Romeo 
and Juliet) on strings over a finite alphabet. Its rules are defined by a context- 
free grammar (CFG) and its target by a regular language given by a regular 
expression (equivalently, a non-deterministic automaton, NFA). In each move, 
Juliet jumps to a position of the current word and Romeo rewrites the cor- 
responding letter according to some rule of the grammar. Juliet wins a play 
if the string obtained belongs to the target language. The intended meaning is 
obvious: Juliet is the system, Romeo the environment, the CFG corresponds 
to the service specification and the target language to the request specification.

We consider the complexity of deciding the existence of a winning strategy
for Juliet in two variants. The first one, called combined complexity, means that
both the specification of the game and the initial string are given as input. In the
second variant, called data complexity, we fix a game specification and a target
language, and the input consists of a string only. It shows how the complexity
behaves relative to the length of the string. This can be motivated by the fact
that the specification of the system is often fixed once and for all, while the data
may frequently change. The data complexity then measures the difficulty of the
problem after preprocessing the specification.

We show that without any restrictions, there is a fixed CF-game for which 
data complexity is already undecidable. Thus we consider simpler variants of the 
problem by restricting the set of rules, the regular target language, or the strat- 
egy. The above example already suggests two restrictions. First, both service 
calls give rise to one new service call tag, only. This means that the underlying 
CFG is linear. We also consider the more restricted case of unary grammars, 
where a service call may only return another service call or some data without 
any service call tag. A more realistic restriction, that is satisfied by the above 
example and probably by most applications, is that the iterated answer of ser- 
vice calls does not give back a tag with the same service call. This restriction 
corresponds to non-recursive CFGs and to non-recursive CFGs of given depth 
(bounded CFGs). The problem is decidable for all these restrictions, although it 
is intractable in some cases (e.g., ExpSpace for non-recursive grammars without 
uniform depth bound). We also consider left-to-right strategies where Juliet has 
to traverse the string from left to right. In the above scenario this amounts to
having a heuristic for parsing the data tree only once, such that if the system
decides not to call a service, it never comes back to this service again. This drastically
limits the possibilities of the system but also significantly decreases the
complexity of the problem. Combined with general CFGs the decision complexity
is 2ExpTime and combined with non-recursive rules it is ExpTime. But for
all other restrictions the complexity is at most PSpace. This restriction allows 
for a uniform decision procedure (and very efficient preprocessing as well) as an 
automaton accepting all winning configurations (strings) can be computed from 
the CF-game independently from the input string. To further decrease the com- 
plexity we also consider games where the specification of the target language 
is given as a deterministic automaton (DFA). In the case of bounded CFGs, 
and left-to-right strategies we end up with a tractable PTime decision proce- 
dure. This case seems rather restrictive at first sight, but it is general enough to 
handle many practical cases and it has been implemented in AXML [11]. 

Figures 2 and 3 summarize our results. The numbers in brackets refer to the 
corresponding theorem or lemma, respectively. All complexities are tight. 

Related work. For left-to-right strategies there is a tight connection with
games on pushdown graphs [16] (see Propositions 1 and 2), which explains the
decidability for arbitrary CFGs. A question related to the game problem is that
of verifying properties of infinite graphs defined by CF-games (model-checking).
Similar questions have been asked, e.g., for automatic graphs [4], process rewriting
graphs [10] and ground tree rewriting graphs [8]. For instance, [8] considers
CTL-like versions of the reachability problem in ground tree rewriting graphs.
Graphs generated by CFGs on strings can be seen as a special case of ground tree
rewriting graphs and therefore the undecidability result obtained in [8] follows
from our Theorem 1.

Fig. 2. Unrestricted strategies [table: restriction on the rules against combined complexity (NFA/DFA) and data complexity]

Fig. 3. Left-to-right strategies

Overview. The paper is organized as follows. Section 2 gives formal definitions 
and fixes the notation. It also describes a couple of extensions of the basic game 
which are used in the lower bound proofs. The results on arbitrary CFGs are
given in Section 3. Non-recursive and linear CFGs are considered respectively in
Section 4 and Section 5. Due to lack of space several proofs are omitted. 

2 Definitions 

A CF-game is a tuple $G = (\Sigma, R, T)$, where $\Sigma$ is a finite alphabet, $R \subseteq \Sigma \times \Sigma^+$
a finite set of rules and $T$ a regular target language. Note that the rewriting rules
do not allow the empty string on the right-hand side. We call a symbol $A$ of $\Sigma$
a non-terminal if it occurs on the left-hand side of some rule in $R$, otherwise a
terminal.

A play of the game $G$ is played by two players, Juliet and Romeo, who
play in rounds. In each round, first Juliet selects a position and then Romeo
chooses a rewriting rule associated with the letter at the chosen position.

A configuration $C$ of the game is a tuple $(w, i, c)$ where $w$ is a string (the
current word), $i \leq |w|$ is a number (the current position) and $c$ is either pos or
rule. A position choice in configuration $(w, i, pos)$ consists of selecting a position
$j \leq |w|$ resulting in $(w, j, rule)$. A rule choice in configuration $(a_1 \cdots a_n, j, rule)$
consists of replacing $a_j$ by a string $u$ such that $a_j \to u$ is a rule of $G$. The
result is $(a_1 \cdots a_{j-1} u a_{j+1} \cdots a_n, j, pos)$. A play starts in an initial configuration
$C_0 = (w, 1, pos)$, for some string $w$.

The play stops and Juliet wins if after some round the resulting string is in 
T. Otherwise it goes on. Romeo wins immediately, if Juliet chooses a position 
j, whose corresponding symbol is terminal. As usual, we say that Juliet has a 
winning strategy in configuration (w,i,c) if, no matter how Romeo plays, T is 
reached within a finite number of moves. 
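The rules of play can be illustrated by a bounded search in Python. This is a minimal sketch under stated assumptions, not a decision procedure: the game is undecidable in general, so the search only looks `rounds` rounds ahead, and a result of `False` means that no win was found within that bound. The function name, the dictionary encoding of the rules and the toy rules used in the note below are hypothetical, and the initial string is assumed to count as winning if it already lies in the target.

```python
import re


def juliet_wins(w, rules, in_target, rounds):
    """Bounded search for a winning strategy of Juliet on the string w.

    rules maps a non-terminal letter to the list of its right-hand sides;
    in_target is a predicate for membership in the target language T.
    """
    if in_target(w):
        return True
    if rounds == 0:
        return False
    for j, letter in enumerate(w):
        options = rules.get(letter)
        if not options:
            continue  # choosing a terminal position loses immediately
        # Juliet picks position j; Romeo may answer with any rule for w[j].
        if all(
            juliet_wins(w[:j] + u + w[j + 1:], rules, in_target, rounds - 1)
            for u in options
        ):
            return True
    return False
```

With the toy rules `{"A": ["a", "B"], "B": ["a"]}` and target `a+` (tested with `re.fullmatch`), Juliet wins on `"AA"` within four rounds. With the rules `{"A": ["a", "A"]}` Romeo can answer $A \to A$ forever, so no bound suffices on `"A"`.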

Note that the winning condition for Juliet is in the first level of the Borel 
hierarchy (reachability of a set). By Martin’s determinacy theorem, CF-games 
are thus determined i.e., from each configuration one of the two players must 
have a winning strategy, [6] . 

We consider the decision problem for Juliet to have a winning strategy in 
G on a string w. This comes in two flavors, combined decision problem and data 
decision problem. The combined decision problem is: 

[Combined] INPUT: A CF-game $G = (\Sigma, R, T)$, a string $w$

OUTPUT: True iff Juliet has a winning strategy in $G$ on $w$.

The data decision problem associated with a CF-game $G$ is:

[Data(G)] INPUT: A string $w$

OUTPUT: True iff Juliet has a winning strategy in $G$ on $w$.

We say that Juliet has a left-to-right winning strategy if she can always
choose a position which is greater than or equal to the position chosen in the preceding
move. We call the set $R$ of rules of a game unary if each rule is of the form $A \to B$
with $B \in \Sigma$. We call it linear if each right-hand side of a rule contains at most
one non-terminal. The set $R$ is called non-recursive if no symbol $A$ can be derived
from $A$ by a non-empty sequence. For a non-recursive set $R$ we call the maximal
depth $d$ of a leaf in a derivation tree of $R$ the depth of $R$. A CF-game $G$ is unary
(resp. linear, non-recursive) if its set $R$ of rules is.

Extended games. In the lower bound proofs we make use of several exten- 
sions of the basic CF-game in order to simplify reductions. It turns out that the 
complexity of the decision problems does not change in many cases (we omit the 
proof of this result in this abstract) . These extensions are 

— navigation constraints: basically regular expressions associated with a rule, 
which restrict the possible position choices for Juliet in the next move. As 
an example, Juliet can be forced to choose the next position immediately 
to the right of the current one; 

— symmetric rule choice: symbols for which Juliet, instead of Romeo, chooses 
the rule; 

— concatenation of games: games may consist of several successive phases. 

For unrestricted strategies, every game G with all these features can be simulated 
by a usual game G' in polynomial time and every string w can be translated 
into a string w' such that Juliet wins (G, w) iff she wins (G', w'). Furthermore, 
unarity, linearity, non-recursiveness and even boundedness are preserved (but not 
all combinations, e.g. unarity and boundedness are not preserved at the same 
time). A similar result holds for left-to-right strategies. Navigation constraints
behave in an analogous way. For symmetric rule choice, unarity is only preserved
if all rules have navigation constraints. Concatenation of games does not seem 
to work here. 



3 Unrestricted Rules 

The section is divided into two parts. In the first one we consider unrestricted 
strategies, while the second one is devoted to left-to-right strategies. 

Unrestricted strategies. We prove first that in general, both decision prob- 
lems are undecidable. We will make use of the following lemma which establishes 
a close connection between computations and CF-games. 

Lemma 1. Let $M$ be an alternating Turing machine with space bound $s(n)$ and
initial state $q_0$. We can construct in polynomial time a unary game $G = (\Sigma, R, T)$
such that the following assertion holds:

For every input $w = a_1 \cdots a_n$ to $M$, Juliet has a winning strategy for $G$ on
the string $w' = \$ (q_0, a_1) a_2 \cdots a_n$ if and only if $w \in L(M)$.

The proof idea for the lemma above is to simulate a computation path of the 
alternating TM by letting Juliet play in existential configurations (symmetric 
rule choice) and Romeo in universal configurations. One single transition is 
simulated by a sequence of game moves, in which we use navigation constraints 
for forcing the players to rewrite the symbols affected by the transition. 

From the theorem below it follows that both decision problems of CF-games 
are undecidable: 

Theorem 1. There exists a CF-game G for which the data decision problem is 
undecidable. 

Proof (Sketch): We reduce the Post correspondence problem (PCP) to Data(G),
for some fixed game $G$. An instance of PCP is given by two sequences $u_1, \dots, u_n$
and $v_1, \dots, v_n$ of finite words over $\{a, b\}$ for some $n \in \mathbb{N}$. The problem is to
check whether there exist $m > 0$, $i_1, \dots, i_m$, such that $u_{i_1} \cdots u_{i_m} = v_{i_1} \cdots v_{i_m}$.
The game is played on the string

w = $ uifeHifeOl ■U 2 &H 2 &OOI • • • We refer to the prefix of w 

before $S$ by $w_0$. The string $w$ encodes the PCP strings, together with their
index. The symbol $S$ will generate the solution $i_1, \dots, i_m$.

The game has two phases. First, Juliet uses (non-linear, recursive) rules 
{S' — >■ SqSiS, So — >■ 0, Si — >■ 1, # — >■ U#} to generate the string 
■iHo(0Si)*i(Sol)*^ • • • (0Si)*”‘S ff. In phase two it is checked that this string 
encodes a solution to PCP. As a deterministic TM can do this in linear space, 
Lemma 1 guarantees that it also can be done by a CF-game. □ 

Left-to-right strategies. In this section we consider only left-to-right 
strategies. Here, all problems become decidable, but the complexity depends 
on the representation of the target language. We first show that when the target 
language is given as a DFA, CF-games are closely related to pushdown games.
We then show that when the target language is given as an NFA, there is an
inherent exponential blowup.

A reachability pushdown game is played on a graph $G_{\mathcal{P}}$ associated with an
alternating pushdown system $\mathcal{P} = (Q = Q_E \uplus Q_A, \Gamma, \delta, F)$. The nodes of $G_{\mathcal{P}}$
are the configurations $(q, u) \in Q \times \Gamma^*$ of $\mathcal{P}$. The set $Q$ of states is partitioned
into existential ($Q_E$, Eve's states) and universal ($Q_A$, Adam's states) states,
and a node of $G_{\mathcal{P}}$ is existential (universal, resp.) if its control state is existential
(universal, resp.). The transition relation $\delta \subseteq Q \times \Gamma \times Q \times \Gamma^*$ determines the
edge relation $(q, u) \vdash (q', u')$ of $G_{\mathcal{P}}$.

Eve wins the reachability game if, whatever Adam's choices are, she can
reach a final configuration. It is known that deciding whether a configuration is
winning is ExpTime-complete [16]. Moreover, the set of winning configurations
can be described by an alternating automaton of exponential size [5,14].

The next two propositions show the relation between pushdown games and 
CF-games with left-to-right strategies and DFA target language. 

Proposition 1. Given a game $G = (\Sigma, R, T)$, where T is a DFA with initial
state $q_0$, we can construct in polynomial time a pushdown system P such that
Juliet wins the game G on w if and only if the configuration $c = (q_0, w\$)$ is
winning in the reachability pushdown game.

Proof. Let Q denote the set of states of the DFA T and $\delta_T$ its transition function.
The states of P are $Q \cup \bar{Q} \cup \{f\}$, with Q the existential states, $\bar{Q}$ the universal ones and
f the unique final state. The stack symbols are $\Sigma \cup \{\$\}$, where $\$ \notin \Sigma$.

For every pair $q \in Q$, $A \in \Sigma$ we have the transitions $\delta(q, A) =
\{(\delta_T(q, A), \mathrm{pop}), (\bar{q}, A)\}$. The transitions correspond to Juliet either skipping
the current position (pop), or selecting it and letting Romeo play next. For
every pair $(\bar{q}, A)$ and every rule $A \to u$ of G we have a transition $(q, u) \in \delta(\bar{q}, A)$.
These transitions correspond to Romeo choosing the corresponding rule in R.
Finally, we add the transitions $(q, \$, f, \$)$ for every accepting state q of T. □

Proposition 2. Given a pushdown system P we can construct in polynomial
time a game $G = (\Sigma, R, T)$, where T is a DFA, such that for any configuration
$c = (q, A_1 \cdots A_n)$ of P, c is winning in the pushdown game if and only if Juliet
wins the game G on $w = (q, A_1) A_2 \cdots A_n$.

Proposition 2 is shown by a game simulation using extended games. From
Propositions 1 and 2 and [16,5,14] we obtain immediately:

Theorem 2. Given a game $(\Sigma, R, T)$, where T is given as a DFA, it is
ExpTime-complete to know whether Juliet has a winning left-to-right strategy.

Moreover, the set of input words for which Juliet has a left-to-right win-
ning strategy is regular, and an alternating automaton that recognizes it can be
constructed in exponential time from $(\Sigma, R, T)$.

Note 1. Proposition 1 above cannot be extended to CF-games with a non-
deterministic target automaton. Indeed, consider the following example.
In the game $G = (\Sigma, R, A)$ where $R = \{b \to c \mid d\}$, Juliet
has a winning strategy on ab: rewrite b, as both ac and ad are
accepting for A. But in the pushdown system as constructed
in the proof of Proposition 1, Adam has a winning strategy
on $(q_0, ab\$)$: after reading a, Eve has to commit to state $q_1$ or
to state $q_2$. Depending on Eve's choice at that state, Adam
will choose respectively d and c for replacing b and thus will
end in a non-accepting configuration with state respectively
$q_4$ and $q_5$.

Theorem 3. Given a game $(\Sigma, R, T)$ with T given by an NFA, it is 2ExpTime-
complete to know whether Juliet has a winning left-to-right strategy.

Proof (Sketch): As T can be transformed into an exponential size DFA, the 
2ExpTime upper bound follows immediately from Proposition 1 and the fact 
that the winner in a pushdown game can be determined in exponential time in 
the size of the game. The lower bound is shown by simulating the behavior of 
an alternating exponential space Turing machine M on input x. Starting from 
SEzx, during a first phase the players keep rewriting the leftmost symbol only, 
generating a sequence of configurations of M . Each configuration is encoded 
by a sequence of (symbol, position)-pairs, where the position is an exponential 
size number encoded in binary. The alternation of M is mimicked by alternating 
between Juliet's and Romeo's choices (symmetric rule choice). In a second phase,
it is checked in a single left-to-right pass that the outcome of phase 1 really
encodes an accepting computation of M on x. That is, Romeo gets the chance
to object to each position of the current string, i.e., to replace it by a special symbol.
Then Juliet wins immediately if the objected position is correct; hence she wins
the game if there is no error in the outcome of the first phase. It is crucial here
that an NFA of polynomial size in n can express that $j \neq i$ and $j \neq i + 1$, for
two counter values $i, j < 2^n$. □

4 Non-recursive Rules 

In this section we focus on non-recursive games. We also consider bounded games,
i.e., with depth bounded by some constant d. Non-recursive games are important 
in practice because many applications do not have recursion in the calls of web 
services and the nesting of calls is small. 

Unrestricted strategies. We first consider non-recursive sets of rules. We 
stress that the lower bound of the following theorem does not depend on whether 
the target language is coded as an NFA or as a DFA. Indeed we can show that in 
general, for unrestricted strategies, a CF-game with a target language given by 
an NFA can be reduced in polynomial time to a CF-game whose target language 
is given by a DFA, while preserving unarity, linearity and non-recursiveness. 

Theorem 4. It is ExpSpace-complete to decide whether Juliet has a winning
strategy in a non-recursive game G on string w.

Proof (Sketch): We show the upper bound by constructing an alternating TM
that decides in exponential time whether Juliet has a winning strategy. It main-
tains on its tape the current game configuration. At each step it checks whether
the string of the current configuration is in the target of the game. If yes, it stops
and accepts. If not, it nondeterministically chooses a position to rewrite and uni-
versally branches over all possible rewritings. The time of each computation path
is $O(m^d |w|)$, where m is the length of the maximal word occurring in G and d is
the depth of G. The lower bound is obtained by simulating an exponential time
alternating TM by a game. □
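
The alternating procedure of the upper bound can be mimicked on small instances by a plain recursive search. The following is only a sketch under stated assumptions: the names `make_solver`, `rules` and `in_target` are hypothetical, the rules are given as a dictionary from letters to the words Romeo may choose, and we assume that Juliet may only select a position whose letter has at least one rule. The recursion terminates only because the rules are assumed to be non-recursive.

```python
from functools import lru_cache


def make_solver(rules, in_target):
    """Return a function deciding whether Juliet wins on a given string.

    rules:     dict mapping a letter to the list of replacement words
               (Romeo picks one of them); assumed non-recursive.
    in_target: predicate telling whether a string is in the target language.
    """

    @lru_cache(maxsize=None)
    def juliet_wins(w):
        # Juliet wins at once if the current string is in the target.
        if in_target(w):
            return True
        # Existential choice: Juliet selects a position to rewrite.
        for i, letter in enumerate(w):
            options = rules.get(letter, [])
            # Universal choice: Romeo may answer with any rule for this letter.
            if options and all(
                juliet_wins(w[:i] + u + w[i + 1:]) for u in options
            ):
                return True
        return False

    return juliet_wins


# The example of Note 1: b may be rewritten to c or d.
wins = make_solver({"b": ["c", "d"]}, lambda w: w in {"ac", "ad"})
print(wins("ab"))
```

With the target restricted to the single word ac, the same call reports that Juliet loses, since Romeo can answer with d.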

We now consider games with set of rules of depth bounded by some given 
d. The first lemma is used in lower bound proofs to get down from depth d to 
depth 1 , the second lemma shows that the data decision problem is already hard 
for d = 1 and unary rules. 

Lemma 2. For each $d > 1$ there are polynomial-time computable functions $G \mapsto
G'$ and $(w, G) \mapsto w'$ transforming any d-bounded CF-game G into a 1-bounded
CF-game $G'$, such that Juliet has a winning strategy in $(G, w)$ if and only
if she has a winning strategy in $(G', w')$. Furthermore, linearity, unarity and
deterministic target are preserved.

Lemma 3. There is a unary CF-game $G = (\Sigma, R, T)$ of depth 1 such that
Data(G) is PSPACE-hard.

Proof (Sketch): We use a reduction from the quantified Boolean satisfaction
problem QBS. The input of QBS is a formula $\Phi$ in prenex normal form, with the
quantifier-free part in 3CNF.

The extended game we construct consists of two phases and is played on
a straightforward string encoding of $\Phi$. The first phase is a left-to-right pass
in which (i) each variable is rewritten by a truth value, by Juliet for ex-
istentially quantified variables and by Romeo for universally quantified vari-
ables, (ii) a clause is selected by Romeo, and (iii) a literal in the clause is selected by
Juliet (symmetric rule choice). The second phase checks that the literal be-
comes true under the variable assignment. As R and T have to be independent of
$\Phi$, variable names are encoded as binary strings. Therefore, this check has to
be done through the game by going back and forth between the value of each
variable and the clause in which it is used. □

When d is fixed, $O(m^d |w|)$ is a polynomial bound. Therefore, from the above
lemma and the upper bound proof of Theorem 4 we get:

Theorem 5. For each d, given a CF-game G with rules of depth bounded by d
and a string w, it is PSpace-complete to tell whether Juliet has a winning strat-
egy for G on w. Furthermore, there is a game G of depth 1 for which Data(G)
is PSpace-complete.

Left-to-right strategies. We continue with non-recursive rules of depth 
bounded by some d, but we now concentrate on left-to-right strategies. Recall 
Lemma 3 which shows that the data complexity of non-recursive games for unre- 
stricted strategies is PSPACE-hard. With combined complexity PSPACE-hardness 
can now be obtained with only one pass. 

Lemma 4. For each $d > 1$, it is PSPACE-hard to tell, for a unary game $G =
(\Sigma, R, T)$ of depth d and a string w, whether Juliet has a left-to-right winning
strategy.

From Lemma 4 and an immediate adaptation of the proof of Theorem 5 for 
left-to-right strategies we obtain: 

Theorem 6. For each $d > 1$, it is PSPACE-complete to tell, for a game $G =
(\Sigma, R, T)$ of depth d, where T is given by an NFA, and a string w, whether
Juliet has a winning left-to-right strategy for G on w.

When the target language is given as a DFA the decision becomes tractable. 
The PTime upper bound of the left-to-right, bounded, DFA target language 
case was already obtained in [11] using automata theoretical techniques. It is 
also the framework which has been implemented in AXML [11]. 

Theorem 7. For each $d > 2$, given a game $G = (\Sigma, R, T)$ of depth d and a
string w where R is non-recursive and where T is a deterministic automaton,
it is PTime-complete to tell whether Juliet has a winning left-to-right strategy
for G on w.

Proof (Sketch): We prove the upper bound by constructing an alternating Tur- 
ing machine deciding whether Juliet has a winning left-to-right strategy in 
logarithmic space. The machine has one pointer per level in the rewriting tree 
corresponding to a position of the input. Those d pointers are sufficient for the 
TM to specify the rightmost part of the current configuration which still needs 
to be processed. 

The lower bound is proved by a reduction from the monotone Boolean circuit
value problem [13]. Let C be a Boolean circuit. We can assume w.l.o.g. that all
paths in C alternate between or and and gates, that all gates have fan-in two, and
that all paths start and end with and gates [13]. From C we construct a DFA which accepts all
strings over {l,r} which describe paths from the output gate to an input gate
which is 1. The extended game is played on a blank string of length depth(C),
and Juliet and Romeo select such a path by rewriting each symbol by l or
r. By doing so Juliet selects the predecessor of or-gates, Romeo of and-gates.
The circuit evaluates to 1 iff Juliet can manage to end up in a 1-gate. Notice that the
constructed game is also unary. □

In the case of non-recursive rules without a uniform depth bound the com- 
bined complexity is one level higher. 

Theorem 8. It is ExpTime-complete to know whether Juliet has a winning left-to-
right strategy for G on w, for G non-recursive and target given by an NFA.

Theorem 9. It is PSPACE-complete to know whether Juliet has a winning left-to-right
strategy for G on w, if G is non-recursive and the target is given by a DFA.

5 Linear Rules

In this section we focus on linear and unary games. Recall from our example that 
in practice, service calls often generate a single subsequent call, which motivates 
linear CFGs. 

Unrestricted strategies. From Lemma 1 it follows immediately that the
data decision problem for unary games is ExpTime-hard. The
following lemma shows that the combined decision problem for linear games
can be solved in ExpTime. Hence, for unary and linear games with unrestricted
strategies all decision problems are ExpTime-complete.

Lemma 5. Given a game $G = (\Sigma, R, T)$ where R is linear, and a string w, one
can tell in ExpTime whether Juliet has a winning strategy.

Proof (Sketch): Let k be the number of non-terminal symbols occurring in w. By 
linearity of R this will be an upper bound on the number of non-terminal symbols 
during the game. We construct an alternating polynomial space Turing machine 
that decides whether Juliet has a winning strategy. Alternation is used to mimic 
the CF-game as usual and memory is used to store the current configuration. For 
the latter the machine needs only to maintain a sequence $f_1 A_1 f_2 A_2 \cdots A_k f_{k+1}$
where the $A_i$ are the non-terminal letters of the current configuration while the
$f_i$ are transition relations of T corresponding to the words between successive
non-terminal symbols. This requires space $O(k |Q|^2)$. □

Left-to-right strategies. We now consider left-to-right strategies for unary 
and linear CF-games. 

Theorem 10. It is PSPACE-complete to tell, given a unary (resp. linear) CF-
game, whether Juliet has a left-to-right winning strategy.

Proof (Sketch): For the lower bound the unary case suffices, for which it follows 
from Lemma 4. For the upper bound it is enough to consider the linear case. 
We check in NPSpace whether Romeo has a winning strategy. Hence, we guess 
the moves of Romeo and, using backtracking, we cycle through all possible 
moves of Juliet. I.e., for the first symbol to replace, we first compute what 
happens if Juliet jumps after one move of Romeo, then after the second, and 
so on. In order to do so, we only need to store a polynomial number of game 
configurations. Each configuration is of polynomial size as in Lemma 5. □ 

When the target language is given as a DFA the complexity decreases in the 
unary case, but not in the linear case: 

Theorem 11. It is PSPACE-complete to tell, given a linear CF-game with target
language given by a DFA, whether Juliet has a left-to-right winning strategy.

Proof (Sketch): The upper bound follows from Theorem 10. We prove the lower
bound by simulating the behavior of a polynomial space Turing machine M by
a CF-game with linear rules and a deterministic target automaton. Using linear
recursive rules of the form $S \to Sa$, Romeo produces a sequence of configu-
rations of M. When he makes a mistake, Juliet immediately stops. Therefore
the mismatch comes from the previous configuration, which is within polynomial
distance (recall that M uses only polynomial space) from the beginning of the
current string. The target language, a polynomial size DFA, can therefore check
the mistake by proper counting and comparing the corresponding positions. □



Theorem 12. It is PTime-complete to tell, given a unary CF-game with target
language given by a DFA, whether Juliet has a left-to-right winning strategy.

Proof (Sketch): The lower bound is done as in the proof of Theorem 7. For 
the upper bound we construct an alternating Turing machine deciding whether 
Juliet has a winning strategy in logarithmic space. Because the grammar is 
unary the length of the word never increases and therefore the input word can 
be processed letter by letter while reading it. For each letter the TM maintains 
on its tape a pointer to the current state in the target automaton and a pointer 
to the current candidate letter for replacing the current position. Looping over 
the alphabet is avoided by using a counter of logarithmic size. Alternation is 
used in a standard way to mimic the CF-game. □ 



6 Discussion 

We have seen that in general it is undecidable to tell who wins a CF-game. We
have also seen several restrictions on the set of rules and on the strategy which 
imply decidability. A natural interesting situation not considered in this paper 
is the case where the target language T is finite. This is often the case in our 
scenario, as a user may require all data looking exactly like this or that, with 
no other options. If no e-rules are allowed, the game is obviously decidable in 
ExpTime (APSpace) as no useful configuration can be larger than the size of 
T. It is open whether this bound is tight. If e-rules are allowed it is not even 
clear whether the game is decidable. 

As mentioned in the introduction the initial motivation of this work was to 
consider trees and rules defined using extended context-free grammars (rules of 
the form $a \to R_a$ where $R_a$ is a regular language). For trees each rule would
rewrite a leaf labeled a into a finite tree (into a regular language in the extended 
case). We can show that all the results presented in this paper extend to trees. 
However the situation is more complex for extended CFG rules. We can extend 
all our results with left-to-right strategies to these grammars but for unrestricted 
strategies even the non-recursive case can be shown to be undecidable. 

Knowing that there exists a winning strategy is one thing. In practice the
system needs to know which web service it should call and in which order. This
corresponds to extracting a winning strategy of a CF-game when it exists. We
can show that this is always possible within the same complexity bounds as for
the decision problem.



Abstract. In 1982 Frieze, Galbiati and Maffioli (Networks 12:23-39)
published their famous algorithm for approximating the TSP tour in an
asymmetric graph with triangle inequality. They show that the algorithm
approximates the TSP tour within a factor of $\log_2 n$. We construct a fam-
ily of graphs for which the algorithm (with some implementation details
specified by us) gives an approximation which is $\log_2 n/(2 + 2\varepsilon)$ times the
optimum solution. This shows that the analysis by Frieze et al. is tight
up to a constant factor and can hopefully give deeper understanding of
the problem and new ideas in developing an improved approximation
algorithm.



1 Introduction 

The Travelling Salesman Problem (TSP) is one of the most famous and well- 
studied combinatorial optimisation problems. 

Definition 1. The (Asymmetric) TSP is the following minimisation problem: 
Given a collection of cities and a matrix whose entries are interpreted as the 
non-negative distance from a city to another, find the shortest tour starting and 
ending in the same city and visiting every city exactly once. 

TSP was proven to be NP-hard by Karp [5] as early as 1972. This means
that an efficient algorithm for TSP is highly unlikely; hence it is interesting
to investigate algorithms that compute approximate solutions. However, Sahni
and Gonzalez [8] showed that in the case of general distance functions it is
NP-hard to find a tour with length within exponential factors of the optimum;
this is true even if the graph is restricted to be symmetric. When the distance
function is symmetric and constrained to satisfy the triangle inequality, the best
known approximation algorithm is a factor 3/2-approximation algorithm due to
Christofides [2]. By a c-approximation algorithm we mean a polynomial time
algorithm that outputs a tour with weight at most c times the optimum weight.

We will study the case when the distance function satisfies the triangle inequal-
ity but is not limited to be symmetric. This case is much less understood. In 1982
Frieze, Galbiati and Maffioli [3] invented their famous algorithm for asymmet-
ric graphs with triangle inequality, which approximates the TSP tour within a
factor of $\log_2 n$. There is only a minuscule lower bound: Papadimitriou and Vem-
pala [7] recently proved that it is NP-hard to approximate the minimum TSP
tour within a factor less than $220/219 - \varepsilon$, for any constant $\varepsilon > 0$. Obviously, huge
improvements can be made, either by a better algorithm or by a tighter lower bound.
Despite a lot of effort during the last twenty years there are only two algorithmic
improvements, both very recent. The first, by Bläser, was announced in 2003 [1].
He improves the algorithm by Frieze et al. and proves a factor $0.999 \cdot \log_2 n$. The
second algorithm, by Kaplan, Lewenstein, Shafrir and Sviridenko [4], decomposes
multigraphs and gives an approximation of $4/3 \log_3 n < 0.842 \log_2 n$. Hence, any
new insight regarding the asymmetric TSP is important. One way to achieve
such insight is to identify potential "hard" instances for the known approxi-
mation algorithms. The algorithms by Bläser and by Kaplan et al. are more
complicated than the original algorithm due to Frieze et al. and are hence more
difficult to understand. Therefore, we study the original algorithm in this paper.
By constructing an explicit family of graphs we establish that the analysis of the
algorithm is tight up to a constant factor (Theorem 1). We apply the algorithm
by Bläser to the graphs and see that, under certain assumptions, it gives the same
approximation as the original algorithm; hence Bläser's analysis is also tight up
to a constant factor.

The main idea of the algorithm by Frieze et al. is: Find a minimum cycle
cover in the graph by linear programming. Choose one node in every cycle and
form a subgraph with the same distance function as in the original graph. Find
a minimum cycle cover in the subgraph. Continue recursively until there is only
one cycle in the cycle cover. The union of the cycle covers forms an Eulerian graph.
Replace edges in the union with shortcut edges in the original complete graph
to obtain a TSP tour. Since the graph respects the triangle inequality, the TSP
tour has weight less than or equal to the sum of the cycle covers.

The description of the algorithm by Frieze et al. [3] leaves some implementa- 
tion details unspecified. The algorithm chooses one arbitrary node in every cycle 
to be in the subgraph and shortcuts are made in arbitrary order. In our analysis 
of the worst case performance we make the following assumptions:

1. The first node in every cycle is chosen for the subgraph. 

2. The shortcuts are made in a certain specified order. 

We have constructed a family of graphs which gives the algorithm by Frieze 
et al. a worst case performance. This shows that the analysis of the algorithm 
by Frieze et al. is tight and our main result is: 

Theorem 1. For every $\varepsilon$ ($0 < \varepsilon < 1/n$), there exists a family of graphs, $G_n$,
such that the approximation algorithm by Frieze et al. [3], with our specifications,
gives a TSP tour T such that

$$\frac{T}{opt(G_n)} \geq \frac{\log_2 n}{2 + 2\varepsilon}.$$

For asymmetric graphs Frieze et al. [3] give another, data dependent algorithm
which gives a $3\alpha/2$-approximation of the TSP tour, where $\alpha$ is the maximum
ratio of $d(v_i, v_j)/d(v_j, v_i)$ for $v_i, v_j \in V$, $v_i \neq v_j$. The idea of the algorithm is
to make the graph symmetric and then use the algorithm for symmetric graphs
by Christofides [2]. We will not show a worst case behaviour of this algorithm;
we just make sure that the class of graphs which has a worst case behaviour for
the ordinary algorithm is not guaranteed to be well-approximated by the data
dependent algorithm.

2 Notations and Conventions 

All graphs in this paper have $n = 2^m$ nodes placed in a circle as in Fig. 1a.
When an algorithm operates on an arbitrary node the ordering is modulo $2^m$.
For example, the node before $v_0$ is $v_{n-1}$. The difference in index for two nodes $v_i$
and $v_j$ is $\min\{|i - j|, n - |i - j|\}$. An interval of node-indexes $[a, b]$ represents, if
$a \leq b$, all numbers $a \leq i \leq b$ and, if $b < a$, (the complement of $[b, a]$) $\cup \{a, b\}$.

Definition 2. For an m-bit integer x we define the function

$Z_m(x) = \max\{k \in \mathbb{Z}_m \mid 2^k \text{ divides } x\}$

In particular $Z_m(0) = m - 1$ since all numbers divide zero.

In words, $Z_m(i)$ is, for $i \neq 0$, the position of the least significant non-zero bit
in the binary representation of i.
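
As a small illustration of Definition 2, the function can be computed with standard bit operations. This is a sketch only: the name `z` is ours, and reducing the argument modulo $2^m$ is an assumption made so that node indexes taken modulo n can be passed in directly.

```python
def z(m, x):
    """Largest k in {0, ..., m-1} such that 2**k divides x."""
    x %= 2 ** m            # assumption: indexes are taken modulo n = 2**m
    if x == 0:
        return m - 1       # every power of two divides zero
    # x & -x isolates the least significant set bit of x.
    return (x & -x).bit_length() - 1


print([z(4, i) for i in range(16)])
```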

2.1 Constructing a Distance Function 

Given a strongly connected graph, G = {V, E) with weighted edges, we define 
the distance between two nodes, d{vi,Vj), as the weight of the shortest path in 
G from Vi to Vj . This distance function clearly obeys the triangle inequality. 
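
One way to obtain such an induced distance function in practice is an all-pairs shortest path computation. The sketch below uses the Floyd-Warshall algorithm, which is our choice for illustration and is not prescribed by the text; the name `induced_distance` and the edge-dictionary input format are assumptions.

```python
import math


def induced_distance(n, edges):
    """All-pairs shortest path weights for nodes 0..n-1.

    edges: dict {(i, j): weight} of directed, non-negatively weighted edges.
    """
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (i, j), w in edges.items():
        d[i][j] = min(d[i][j], w)
    # Floyd-Warshall: allow intermediate nodes 0..k one at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


# Circle of 8 nodes with unit-weight edges in both directions.
n = 8
circle = {}
for i in range(n):
    circle[(i, (i + 1) % n)] = 1
    circle[((i + 1) % n, i)] = 1
d = induced_distance(n, circle)
print(d[0][3], d[0][5])
```

Because every entry is a shortest path weight, the resulting matrix satisfies the triangle inequality by construction.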

2.2 Some Terminology 

In this paper we often discuss parts of graphs; we therefore introduce some terms 
describing such parts. 

Definition 3. A cycle cover for a directed graph $G = (V, E)$ is a subgraph of G
such that for each node $v \in V$, indegree(v) = outdegree(v) = 1. A cycle cover
where every cycle has exactly two nodes is called a 2-cover.

Definition 4. A directed cactus is a strongly connected, asymmetric graph
where each edge is contained in at most (and thus, in exactly) one simple di-
rected cycle [9]. A spanning cactus for an asymmetric graph G is a subgraph of
G that is a directed cactus and connects all vertices in G.

Notations. Throughout the paper, T is a TSP tour, opt(G) is the total weight of
the minimum TSP tour in the graph G, and C is a cycle cover or a cactus. By
cycle we mean simple cycle, and a cycle is denoted by the nodes in it; for example
$(v_i, v_j, v_k)$ is the directed cycle from $v_i$ to $v_j$ to $v_k$ and back to $v_i$.







A. Palbom 




Fig. 1. Left (a): The graph inducing the distance function $d^1_{16}(v_i, v_j)$. All edges have
weight one. Right (b): The Worst Case Spanning Cactus (W-cactus) in the graph
$G^1_{16}$. Each 2-cycle is symbolised with an edge.

3 The Approximation Algorithm 

An intuitive description of the approximation algorithm by Frieze et al. [3] is 
given in the introduction. Their description of the algorithm leaves some imple- 
mentation details unspecified which we now specify. 

To get an upper bound on their algorithm Frieze et al. make the following
analysis: In the worst case all cycles in the cycle cover have length two; hence
at most $\lfloor \log_2 n \rfloor$ cycle covers are produced. The weight of every cycle cover is
less than or equal to opt(G). Thus the spanning cactus formed by the union of
the cycle covers has weight at most $opt(G) \cdot \log_2 n$. Since the graph obeys the
triangle inequality, the TSP tour found from the spanning cactus has weight less than
or equal to $opt(G) \cdot \log_2 n$.
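
The chain of estimates in this analysis can be written out as follows. The symbols $w(\cdot)$ for total weight, $C_1, \ldots, C_r$ for the successive cycle covers and r for their number are our notation, introduced only for this summary.

```latex
\[
  w(T) \;\le\; \sum_{i=1}^{r} w(C_i)
       \;\le\; r \cdot \mathit{opt}(G)
       \;\le\; \lfloor \log_2 n \rfloor \cdot \mathit{opt}(G)
       \;\le\; \log_2 n \cdot \mathit{opt}(G).
\]
```

The first inequality uses the triangle inequality for the shortcuts, the second uses $w(C_i) \le \mathit{opt}(G)$ for every cover, and the third uses that each round keeps at most half of the nodes.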

In order to analyse the algorithm we need to specify the arbitrary choices in 
the description by Frieze et al.: 

1. An arbitrary node from every cycle in a cycle cover is chosen to be in the 
next subgraph. We choose the node with lowest index. 

2. The shortcuts made to transform the spanning cactus to a TSP tour are in 
arbitrary order. We use our procedure SHORTCUT below which makes the 
shortcuts in a specific order. 

SHORTCUT(G, C)
Input:  A graph G = (V, E)
        A cactus C ⊆ E
Output: A TSP tour T
begin
    global T ← ∅;
    global set of visited nodes U ← ∅;
    t ← DEPTH-FIRST(G, C, v_0);
    T ← T ∪ (t, v_0);
    return T;
end

DEPTH-FIRST(G, C, s)
Input:  A graph G = (V, E)
        A cactus C ⊆ E
        The present node s
Output: The last node added, t
begin
    t ← s; U ← U ∪ {s};
    ∀(s, v) ∈ C : v ∉ U do
        T ← T ∪ (t, v);
        t ← DEPTH-FIRST(G, C, v);
    end
    return t;
end



SHORTCUT makes a depth-first search in C starting at the node $v_0$ and con-
nects the nodes in the order they are found. If, in the procedure DEPTH-
FIRST, there are several edges $(s, v_i) \in C$ with $v_i \notin U$ from a node s, edges to
nodes with higher index i are traversed first. The algorithm with these specifi-
cations is called ATSPS.
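
The two procedures can be transcribed almost line by line. The following is a sketch, not the author's implementation: nodes are represented by their indexes 0..n-1, the cactus is passed as a set of directed edges, the tour is returned as an ordered list of edges, and the names `shortcut` and `depth_first` are ours. Recursion depth is assumed to stay small, as it does for the cacti considered here.

```python
def shortcut(n, cactus_edges):
    """Turn a spanning cactus on nodes 0..n-1 into a TSP tour.

    cactus_edges: set of directed edges (i, j) of the cactus.
    Returns the tour as a list of directed edges in the order they are added.
    """
    succ = {i: set() for i in range(n)}
    for i, j in cactus_edges:
        succ[i].add(j)

    tour = []        # the global T of the pseudocode
    visited = set()  # the global U of the pseudocode

    def depth_first(s):
        t = s
        visited.add(s)
        # Edges to nodes with higher index are traversed first.
        for v in sorted(succ[s], reverse=True):
            if v in visited:   # re-checked at every iteration
                continue
            tour.append((t, v))
            t = depth_first(v)
        return t

    t = depth_first(0)
    tour.append((t, 0))  # close the tour at the start node
    return tour


# A small cactus of 2-cycles on four nodes: 0-2, 0-1 and 2-3.
cactus = {(0, 2), (2, 0), (0, 1), (1, 0), (2, 3), (3, 2)}
print(shortcut(4, cactus))
```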

A straightforward analysis shows that the tour produced by the procedure 
SHORTCUT is a TSP tour. Since the algorithm is independent of the distance 
function the TSP tour only depends on the structure of the spanning cactus. A 
comparison with the original algorithm by Frieze et al. shows the following: 

Lemma 1. The TSP tour produced by SHORTCUT on a spanning cactus with
exactly two nodes in every cycle can be produced by the algorithm by Frieze et
al. on the same graph.



4 A Worst Case Approximation 

In this section we construct a simple family of graphs to illustrate the algorithm
by Frieze et al. and the main ideas of the worst case performance. Since the
graphs are symmetric they can be approximated within 3/2 by the algorithm
due to Christofides [2].



4.1 Constructing the Graph 

The distance function is induced by a graph (Fig. 1a) defined as follows:

Definition 5. The distance function $d^1_n(v_i, v_j)$ is induced by an undirected
graph with $n = 2^m$ nodes arranged in a circle. Adjacent nodes are connected
by edges of weight one and there are no other edges in the graph.

Definition 6. Let $G^1_n$ be a complete, directed graph with $n = 2^m$ nodes and the
distance function $d^1_n(v_i, v_j)$.

The distance between two nodes in $G^1_n$ is the difference in index: $d^1_n(v_i, v_j) =
\min\{|i - j|, n - |i - j|\}$. The edges are directed even though they have the same
distance in both directions. The minimum TSP tour is of course to traverse the
nodes in clockwise (or counter-clockwise) order, and $opt(G^1_n) = n$.

4.2 The Spanning Cactus

The algorithm by Frieze et al. recursively finds a minimum cycle cover in the 
graph. In a complete, asymmetric graph, the union of all cycle covers recursively 
produced by the algorithm forms a spanning cactus. 

To get an intuitive understanding of the algorithm ATSPS and its worst
case behaviour we use a graph with $n = 2^4 = 16$ nodes as an example
(Fig. 1b). In the first recursion the minimum cycle cover can consist of one large
cycle or eight cycles of weight two. Both have total weight 16. Assume that the 2-cover
is chosen. Then the cycle cover consists of $\{(v_0, v_1), (v_2, v_3), (v_4, v_5), (v_6, v_7),
(v_8, v_9), (v_{10}, v_{11}), (v_{12}, v_{13}), (v_{14}, v_{15})\}$. Choose the first node in every cycle to
be in the set of nodes for the next recursion. In our example this gives the nodes
with even index. Now the shortest distance between any nodes in the subgraph
is two. Again the procedure can return one large cycle or four cycles of weight
four. Both have total weight 16 and we assume that the 2-cover is returned.
Proceed in the same way until there is just one cycle in the cycle cover. The
union of all 2-covers is called a W-cactus.

Definition 7. For a graph G with $n = 2^m$ nodes a Worst Case Spanning Cactus
or a W-cactus is a subgraph of G such that $E = \{(v_i, v_{i-2^k}), (v_{i-2^k}, v_i) : Z_m(i) =
k\}$.

For n = 16 nodes the W-cactus looks like in Fig. 1b. It can be seen that a
W-cactus in $G^1_n$ has weight $n \log_2 n$.
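
The W-cactus and its weight under the circle distance can be checked numerically on small cases. The sketch below rests on our reading of Definition 7 (a 2-cycle between $v_i$ and $v_{i-2^k}$ for every i with $Z_m(i) = k$, indexes modulo n); the function names are ours.

```python
def z(m, x):
    """Z_m(x): largest k < m such that 2**k divides x (indexes modulo 2**m)."""
    x %= 2 ** m
    return m - 1 if x == 0 else (x & -x).bit_length() - 1


def w_cactus(m):
    """Directed edges of the W-cactus on n = 2**m nodes."""
    n = 2 ** m
    edges = set()
    for i in range(n):
        j = (i - 2 ** z(m, i)) % n
        edges.add((i, j))  # the 2-cycle between v_i and v_{i - 2^k}
        edges.add((j, i))
    return edges


def circle_distance(n, i, j):
    """Difference in index of two nodes on a circle of n nodes."""
    return min(abs(i - j), n - abs(i - j))


m = 4
n = 2 ** m
edges = w_cactus(m)
weight = sum(circle_distance(n, i, j) for i, j in edges)
print(len(edges) // 2, weight)
```

For m = 4 this yields 15 two-cycles of total weight 64, in line with the counts n - 1 and $n \log_2 n$.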

Lemma 2. For a W-cactus C in a graph G with $n = 2^m$ nodes and a node
$s = v_i$, the procedure DEPTH-FIRST returns $v_i$ if i is odd and $v_{i+1}$ if i is even.

Proof. Every node $v_i$ in a W-cactus with odd index, i.e., such that $Z_m(i) = 0$,
is by Definition 7 in exactly one cycle $(v_i, v_{i-1})$. Every node $v_i$ in a W-cactus
with even index, i.e., such that $Z_m(i) = k \geq 1$, is in at least two cycles, $(v_i, v_{i+1})$
and $(v_i, v_{i-2^k})$. When DEPTH-FIRST is called with an odd node $s = v_i$ as
input there is no other cycle $(s, v) \in C$ and the procedure returns t = s. If the
input node $s = v_i$ is even there may be several cycles $(s, v) \in C$, but the one
with smallest difference in index is $(v_i, v_{i+1})$ and it is the last cycle selected in
the loop. The node $v_{i+1} \notin U$ since there is just one cycle connecting the odd
node $v_{i+1}$ with the rest of the cactus. After the recursive call to DEPTH-FIRST,
$t \leftarrow v_{i+1}$. Hence the procedure returns $t = v_{i+1}$ in this case.

To make the notation in some proofs clear we need the following definition: 

Definition 8. For the algorithm ATSPS and the graph $G_n = G_{n,0}$ with n nodes,
the subgraph remaining after the first cycle cover is found is denoted by $G_{n,1}$ and
the subgraph remaining after the i:th recursion is denoted by $G_{n,i}$.

Since every cycle in the cycle cover is a 2-cycle of edges with equal weight, we
visualize every 2-cycle as an undirected weighted edge. The union of the cycle
covers can with this view be seen as a spanning, undirected tree. Since every
node is in at least one cycle the tree is spanning and by the construction the
cover is cycle-free. The view of the spanning cactus as a spanning tree directly
gives that a W-cactus has n - 1 cycles.

4.3 The TSP Tour

We proceed by analysing SHORTCUT. The procedure takes a spanning cactus and returns a TSP tour. It is independent of the distance function and only considers the structure of the spanning cactus. Again we use the graph $G_{16}$ as an example to describe the procedure, and the spanning cactus is a W-cactus. Initially the TSP tour $T = \emptyset$ and the start node $s = t = v_0$. After the first call to DEPTH-FIRST, $T \leftarrow (v_0, v_8)$. After some recursive calls to DEPTH-FIRST the graph will look like Fig. 2(a). Then $U = \{v_0, v_8, v_{12}, v_{14}, v_{15}, v_{13}, v_{10}, v_{11}\}$, and $T$ contains the edges $(v_0, v_8)$, $(v_8, v_{12})$, $(v_{12}, v_{14})$, $(v_{14}, v_{15})$, $(v_{15}, v_{13})$, $(v_{13}, v_{10})$, $(v_{10}, v_{11})$, added in that order, and $t = v_{11}$. In the next step $T \leftarrow T \cup (v_{11}, v_9)$.
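
The order in which SHORTCUT visits the nodes in this example can be reproduced with a short sketch. The pseudocode of SHORTCUT and DEPTH-FIRST is not repeated in this section, so the code below is only a reconstruction under assumptions: the cactus is the W-cactus rooted at $v_0$, the cycles at a node are processed in order of decreasing index difference (so that the cycle to $v_{i+1}$ comes last, as in the proof of Lemma 2), and the tour simply follows the order of first visits. The names `children` and `shortcut_order` are hypothetical.

```python
def children(i: int, m: int) -> list[int]:
    """Cactus neighbours of v_i away from the root, largest index offset first."""
    k = m if i == 0 else (i & -i).bit_length() - 1  # trailing zeros; m for the root
    return [i + (1 << b) for b in range(k - 1, -1, -1)]


def shortcut_order(m: int) -> list[int]:
    """Order in which a depth-first walk from v_0 first visits the nodes."""
    order: list[int] = []

    def depth_first(s: int) -> None:
        order.append(s)
        for v in children(s, m):
            depth_first(v)

    depth_first(0)
    return order


order = shortcut_order(4)                             # n = 16
tour_edges = list(zip(order, order[1:] + order[:1]))  # close the tour at v_0
```

For $m = 4$ the first eight visited nodes are $v_0, v_8, v_{12}, v_{14}, v_{15}, v_{13}, v_{10}, v_{11}$, and the next tour edge is $(v_{11}, v_9)$, in agreement with the example.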







Fig. 2. Left (a): The TSP tour after some steps with the procedure SHORTCUT on a W-cactus with $n = 2^4 = 16$ nodes. Undirected edges represent 2-cycles in the
W-cactus and arrows represent edges in the TSP tour. Right (b): A graph with 
induced distance function. The solid lines are edges in the inducing graph for $n = 16$ and dashed lines are edges
with induced distances. For simplicity weights equal to one are omitted and only some 
induced edges are shown. 



It can be seen that the TSP tour produced by ATSPS, with the assumption that small cycles are preferred in the cycle cover, on this graph has weight larger than $\frac{1}{2} n \log_2 n$. Since the optimum TSP tour has length $n$, this gives an approximation in $\Omega(\log n)$.



5 A Worst Case Performance

In this section we construct the family of graphs that gives the worst case performance of the algorithm ATSPS. The graphs in the previous section have two main disadvantages: they are symmetric, and hence can be approximated by the algorithm due to Christofides, and the minimal cycle cover is not unique. The graphs defined in this section do not have these disadvantages. Figure 2(b) shows a graph defined as follows:

Definition 9. The distance function, $d_n(v_i, v_j)$, is induced by a graph with $n = 2^m$ nodes arranged in a circle. Edges $(v_{i-1}, v_i)$ in this inducing graph have weight $w(v_{i-1}, v_i) = 1 + z_m(i)\epsilon$, where $0 < \epsilon < 1/n$. Edges $(v_i, v_{i-2^k})$ with $z_m(i) = k < m - 1$ have weight $w(v_i, v_{i-2^k}) = 2^k + (2^k - 1)\epsilon$.



Definition 10. Let $G_n$ be a complete, asymmetric graph with $n = 2^m$ nodes and distance function $d_n(v_i, v_j)$.

The distance between $v_i$ and $v_j$ in $G_n$ is equal to the edge weight of $(v_i, v_j)$ in the inducing graph whenever such an edge weight is specified in Definition 9. In fact, it also holds that $d(v_i, v_{i-2^k}) = 2^k + (2^k - 1)\epsilon$ even when $k = m - 1$. We use this in several proofs below.

The graph is clearly asymmetric and the maximum ratio $\alpha$ between edges in different directions is linear in $n$ if $\epsilon < 1/\log_2 n$ and $n > 4$. Frieze et al. show in their analysis of their data dependent algorithm that the approximation is in $O(\alpha)$. Hence, a graph $G_n$ is at least not proven to be easily approximated by the data dependent algorithm.

Lemma 3. In a graph $G_n$ the optimum TSP tour has weight $n + (n - 2)\epsilon$.

Proof. The minimum TSP tour, $\mathrm{opt}(G_n)$, is to traverse the nodes in clockwise order. Every edge has weight at least one, which gives a weight of $n$. The “extra” weight is

log2 n log2 n 

£ (”J) - e = £ (^)) - e = (^ - 2)e 

i=l i=l 

and the total weight is $n + (n - 2)\epsilon$.

Is the tour minimal? There are $n$ edges in $G_n$ of weight one. Only half of them can be in a TSP tour since they have opposite direction. There are $n/2$ edges of weight less than two; all induced edges have length greater than 2. The TSP tour consists of the $n$ shortest edges possible in a TSP tour and is hence minimal. □

For the following two lemmas we need a simpler distance function: 

Definition 11. The distance function $d^s_n(v_i, v_j)$ is induced by a symmetric graph with $n = 2^m$ nodes arranged in a circle. Adjacent nodes are connected by an edge. The weight of an edge is $w(v_{i-1}, v_i) = 1 + z_m(i)\epsilon$, where $0 < \epsilon < 1/n$. Let $G^s_n$ be a complete, directed graph with $n = 2^m$ nodes and the distance function $d^s_n(v_i, v_j)$.

The distance function in $G^s_n$ is a “symmetrised” version of the distance function in $G_n$. The distance of $(v_i, v_{i-1})$ in $G^s_n$ is equal to the minimum of the weights of $(v_i, v_{i-1})$ and $(v_{i-1}, v_i)$ in the inducing graph of Definition 9 (non-existent edges are here given weight $\infty$). Edges $(v_i, v_{i-2^k})$ in $G^s_n$ with $z_m(i) = k < m - 1$ are assigned the induced distance $d(v_i, v_{i-2^k}) = 2^k + (2^k - 1)\epsilon$, which is the minimum of the weights of $(v_i, v_{i-2^k})$ and $(v_{i-2^k}, v_i)$ in the inducing graph of Definition 9. Hence edges in $G^s_n$ are equal to or shorter than edges in $G_n$. If a cycle cover is minimal in $G^s_n$ and all edges in the cycle cover have the same distance in $G^s_n$ and in $G_n$, then the cycle cover is minimal in $G_n$ as well.

Lemma 4. For $j < m$ and $n = 2^m$ it holds in the symmetrised graph $G^s_n$ that $d(v_{i-2^j}, v_i) = 2^j + \epsilon \cdot (2^j - 1 + z_m(i \gg j))$. Here $\gg$ denotes a bitwise shift to the right padded with zeros on the left, i.e., $i \gg j = \lfloor i \cdot 2^{-j} \rfloor$.

Proof. Consider the edge $(v_{i-2^j}, v_i)$ in the symmetrised graph $G^s_n$. The path of edges $[i - 2^j, i]$ in the inducing symmetric graph has weight

$$d(v_{i-2^j}, v_i) = \sum_{t=i-2^j+1}^{i} (1 + z_m(t)\epsilon) = 2^j + \epsilon \cdot \sum_{t=i-2^j+1}^{i} z_m(t).$$

When the $j$ least significant bits in $t$ are zero, $z_m(t) = z_m(t \gg j) + j$. Exactly one index $t$ in the range has this property, and for it $t \gg j = i \gg j$. The remaining terms sum up to

$$\sum_{t=1}^{2^j-1} z_m(t) = 2^j - 1 - j.$$

If $j = m - 1$ the path $[i, i - 2^j]$ has equal weight to the path $[i - 2^j, i]$. If $j < m - 1$ the path $[i, i - 2^j]$ has weight larger than $2^{m-1}$. Since $\epsilon < 1/n$ this is larger than the weight of the path $[i - 2^j, i]$. Hence the path $[i - 2^j, i]$ is minimal and therefore induces the distance between $v_{i-2^j}$ and $v_i$ in $G^s_n$. □
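
The counting step in this proof can be checked numerically. The sketch below assumes that $z_m(t)$ counts the trailing zero bits of $t$ and reads Lemma 4 as $d(v_{i-2^j}, v_i) = 2^j + \epsilon(2^j - 1 + z_m(i \gg j))$. It compares only the integer coefficients of $\epsilon$, and it restricts to $2^j \leq i < n$ so that no index wraps around the circle. The helper names are illustrative.

```python
def trailing_zeros(t: int) -> int:
    """Number of trailing zero bits of a positive integer (assumed meaning of z_m)."""
    assert t > 0
    return (t & -t).bit_length() - 1


def eps_coefficient_of_path(i: int, j: int) -> int:
    """Sum of z_m(t) over the circle edges t = i - 2^j + 1 .. i."""
    return sum(trailing_zeros(t) for t in range(i - (1 << j) + 1, i + 1))


def eps_coefficient_closed_form(i: int, j: int) -> int:
    """Coefficient of epsilon in the closed form: 2^j - 1 + z_m(i >> j)."""
    return (1 << j) - 1 + trailing_zeros(i >> j)


m = 6
n = 1 << m
lemma4_holds = all(
    eps_coefficient_of_path(i, j) == eps_coefficient_closed_form(i, j)
    for j in range(m)
    for i in range(1 << j, n)
)
```

When $z_m(i) = j$, the shifted index $i \gg j$ is odd and the closed form reduces to $2^j - 1$, which is the coefficient used for the W-cactus edges.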

An edge $(v_{i-2^j}, v_i)$ with fixed value of $j$ has minimum distance if it is in the W-cactus, since edges in the W-cactus have $z_m(i) = j$, which gives $z_m(i \gg j) = 0$ in the lemma above.

Lemma 5. In the graph $G_n$ the algorithm ATSPS produces a W-cactus as spanning cactus and it has weight at least $n \log_2 n$.

Proof. First we show by induction that a minimum cycle cover in the symmetrised graph $G^s_n$ is a W-cactus and has the desired weight. Then we show that the same cover exists in $G_n$. In the beginning $G^s_{n,0}$ consists of all $n$ nodes $v_i$, with $z_m(i) \geq 0$. Every other edge has length one and every other edge has at least one extra $\epsilon$-distance added, and the 2-cover is the unique minimal cycle cover. The edges in the cycles are $(v_i, v_{i-2^0})$, $(v_{i-2^0}, v_i)$, where the index $i$ is odd, i.e., $z_m(i) = 0$. If the first node in every cycle is put in the subgraph, $G^s_{n,1}$ consists of nodes $v_i$ with even index $i$: $z_m(i) \geq 1$. Suppose $G^s_{n,j}$ consists of all nodes $v_i$ with $z_m(i) \geq j$ and that the cycle covers in $G^s_{n,r}$ for $r < j$ form a subgraph of the W-cactus. Every other edge has by Lemma 4 distance $2^j + (2^j - 1)\epsilon$ and every other edge has at least one extra $\epsilon$-distance added. The 2-cover is minimal. Select the first node in every cycle to be in $G^s_{n,j+1}$. Then the cycle cover in $G^s_{n,j}$ is a subgraph of the W-cactus and nodes $v_i$ in $G^s_{n,j+1}$ have $z_m(i) \geq j + 1$. By induction the 2-cover is minimal for every subgraph and it forms a W-cactus.

If there were no $\epsilon$-weights the cactus would have weight $n \log_2 n$. Since all edges in the minimum cycle cover have the same weight in $G_n$, the W-cactus is a minimum cycle cover in $G_n$ as well. □

Now we have a W-cactus as spanning cactus. The procedure SHORTCUT 
makes a TSP tour from the cactus. 

Lemma 6. In the graph $G_n$ the approximation algorithm ATSPS gives a TSP tour of weight greater than $(n \log_2 n)/2$.

Proof. For a graph $G_n$ the procedure ATSPS gives by Lemma 5 a W-cactus with weight greater than $n \log_2 n$ as spanning cactus. The procedure SHORTCUT does not depend on the distance function. Hence if the spanning cactus is a W-cactus, SHORTCUT will always return the same TSP tour. We show that the TSP tour in the symmetrised graph $G^s_n$ with a W-cactus as spanning cactus has the desired weight, and since every edge in $G_n$ has at least the same weight as in $G^s_n$, the TSP tour in $G_n$ must have at least the same weight.

From the proof of Lemma 5, the algorithm ATSPS, given $G^s_n$, produces a W-cactus of weight greater than $n \log_2 n$. To show that the TSP tour produced by the algorithm has weight larger than half the weight of the W-cactus, we construct an injective function from the cycles in the cactus to the edges in the TSP tour such that each edge in the TSP tour has higher or equal distance than the longest edge in the corresponding cycle.

A cycle $(v_i, v_j)$ in the W-cactus is mapped to an edge $(v_t, v_j)$ in the TSP tour such that either $t = i$ or $t = j + (j - i) + 1$. The value of $t$ is determined by the order in which edges are added to the TSP tour in SHORTCUT. Suppose DEPTH-FIRST is called with some even node $s$. The first cycle $(s, v)$ of the cactus processed in the loop is mapped to the edge $(s, v)$. For the remaining iterations in the loop, the cycle $(s, v)$ is mapped to the edge $(t, v)$ where $t$ was obtained from the call to DEPTH-FIRST in the previous iteration of the loop.

The mapping is obviously injective since DEPTH-FIRST visits every node exactly once. The first cycle is mapped to one edge in the cycle. If there are several cycles $(s, v)$ of the cactus in DEPTH-FIRST, $s$ is even. Consider the cycle $(v_i, v_j) = (v_i, v_{i+2^k})$ which is not the first cycle chosen. The node $v = v_{i+2^{k+1}}$ was sent to DEPTH-FIRST in the previous recursion. Since $i + 2^{k+1}$ is even, $t = i + 2^{k+1} + 1$ was returned by Lemma 2. Hence, the cycle is mapped to the edge $(v_t, v_j) = (v_{i+2^{k+1}+1}, v_{i+2^k})$. The difference between the indices of the nodes in this edge is $\min\{|t - j|, n - |t - j|\} = 2^k + 1$, which is greater than the corresponding difference for the cycle $(v_i, v_j)$, since $\min\{|i - j|, n - |i - j|\} = 2^k$. Therefore Lemma 4 implies that $d(v_t, v_j) \geq d(v_i, v_j)$.

Thus for every cycle there is an associated edge with length at least as high as the edges in the cycle, and the TSP tour has weight at least half of the W-cactus in the symmetrised graph $G^s_n$. □

By combining Lemma 6 and Lemma 3 we have proved Theorem 1. 



6 Conclusions and Future Work 

The graphs $G_n$ form a family of asymmetric graphs for which the approximation algorithm for asymmetric TSP by Frieze et al., with our specifications, shows a worst case behaviour. The algorithm returns by Theorem 1 a TSP tour of weight greater than $(\mathrm{opt}(G_n) \cdot \log_2 n)/(2 + \epsilon)$, $\epsilon > 0$. The analysis of the algorithm by Frieze et al. shows that the algorithm gives a TSP tour with weight less than or equal to $\mathrm{opt}(G_n) \cdot \log_2 n$; hence the analysis of the algorithm by Frieze et al. is tight up to a factor of 1/2. One improvement of the algorithm might be to make the choices data dependent. It would also be interesting to investigate the average behaviour of the algorithm with random choices.

The ratio $\alpha$ between edges in different directions is greater than $n/2$, and the data dependent approximation algorithm by Frieze et al. is then proven to give an approximation better than $3\alpha/2 = 3n/4$, or $O(n)$. When we apply the data dependent algorithm to $G_n$, there are two possible outcomes. The algorithm converts the asymmetric graph to a symmetric one, uses the algorithm due to Christofides [2] and in the end arbitrarily chooses the direction of the found TSP tour. The undirected TSP tour is around the circle. If the direction is chosen to be clockwise, the algorithm finds the optimum TSP tour of weight $n$. If the direction on the other hand is chosen to be anti-clockwise, the directed cycle has weight $n \log_2 n$, which is the same as for the original approximation algorithm. Thus with one choice assumed to be bad, the data dependent algorithm approximates the asymmetric TSP tour within a factor of $\log_2 n$. The expected approximation is $(\log_2 n)/2$ over the choice of orientation of the tour.

The new algorithm by Bläser is a development of the algorithm by Frieze et al. and is more complicated. When we apply it to our graphs $G_n$ it can return different TSP tours. It turns out that it is possible to specify Bläser’s algorithm in such a way that it, for the graph $G_n$, returns the same TSP tour as the algorithm by Frieze et al. Hence, Theorem 1 applies also to Bläser’s algorithm.

In their very recent algorithm Kaplan et al. [4] introduce some new ideas. Basically, they compute a fractional cycle cover with certain 2-cycle constraints and use such covers to extract cycle covers with few 2-cycles for the underlying graph. A very interesting direction for future research is to investigate if the graphs $G_n$ defined in this paper, or any other class of graphs, give an approximation ratio $\Omega(\log n)$ for this new algorithm.

The question by Karp [6] as to whether there is a polynomial time heuristic 
for which the approximation ratio of asymmetric TSP is bounded by a constant 
is still open.

On Visibility Representation of Plane Graphs*

Abstract. In a visibility representation (VR for short) of a plane graph
G, each vertex of G is represented by a horizontal line segment such that 
the line segments representing any two adjacent vertices of G are joined 
by a vertical line segment. Rosenstiehl and Tarjan [11], Tamassia and 
Tollis [14] independently gave linear time VR algorithms for 2-connected 
plane graph. Recently, Lin et. al. reduced the width bound to J 

[10]. In this paper, we prove that any plane graph G has a VR with width at most $\lfloor \frac{13n-24}{9} \rfloor$.

For a 4-connected plane triangulation G, we give a visibility represen- 
tation of G with height at most [^]. In order to show that, we first 
show that every such graph has a canonical ordering tree with at most
leaves instead of the previously known bound [ J , which is of 
independent interest. All of them can be obtained in linear time. 



1 Introduction 

A visibility representation (VR for short) of a plane graph G is a representation, 
where the vertices of G are represented by non-overlapping horizontal segments 
(called vertex segment), and each edge of G is represented by a vertical line 
segment touching the vertex segments of its end vertices. A simple linear time 
VR algorithm was given in [11,14] for a 2-connected plane graph G. It only uses an st-orientation of G and the corresponding st-orientation of its dual G* to construct the VR.

One of the main concerns afterwards for VR is the size of the representation, 
i.e., the height and width of VR. (The height and width of VR is the height and 
width of the rectangle in which the VR is drawn into.) Recently, using a more 
sophisticated greedy algorithm, Lin et. al. reduced the width bound to ^ 

In this paper, we prove that every plane graph G has a VR with width at most $\lfloor \frac{13n-24}{9} \rfloor$ using the simpler algorithms from [11,14].

Canonical ordering tree of plane graphs is another important concept. In 
many applications of canonical ordering tree T, the number of leaves of T is a 
crucial parameter. It is known that there exists a plane graph for which every 
canonical ordering tree has at least J leaves. 

We explore the properties of regular edge labeling (REL for short) of a 4-connected plane triangulation G and prove: Every 4-connected plane triangulation G has a canonical ordering tree with at most leaves. Applying this result, we show that: Every 4-connected plane graph G has a VR with height at
most 1"^]. This improves the previously known bound |"^1 [15]. 

We summarize related previous results and our results in the following table: 



References     Plane graph G                      4-connected plane graph G
[11,14]        Width of VR ≤ (2n - 5)             Height of VR ≤ (n - 1)
[7]            Width of VR ≤
[10]           Width of VR ≤
[8]                                               Width of VR ≤ (n - 1)
[15]           Height of VR ≤
This paper     Width of VR ≤ ⌊(13n - 24)/9⌋       Height of VR ≤


The present paper is organized as follows. Section 2 introduces preliminaries. Section 3 presents the construction of a VR with width at most $\lfloor \frac{13n-24}{9} \rfloor$. Section 4 explores the properties of REL of a 4-connected plane triangulation G and
proves: Every 4-connected plane graph G has a VR with height at most 
It also proves that: Every 4-connected plane triangulation G has a canonical 
ordering tree with at most leaves. We refer the readers to the technical 

reports 2003-06 and 2003-10 of CSE department at SUNY at Buffalo for omitted 
details. The authors are grateful to the anonymous referees for helpful comments. 



2 Preliminaries 

In this section, we give definitions and preliminary results. $G = (V, E)$ denotes a graph with $n = |V|$ vertices and $m = |E|$ edges. An embedding of a plane graph divides the plane into a number of two dimensional regions, called faces. The unbounded region is the exterior face. Other regions are interior faces. The vertices and the edges on the exterior face are called exterior vertices and exterior edges. Other vertices and edges are interior vertices and interior edges. A path $P$ of $G$ is a sequence of distinct vertices $u_1, u_2, \ldots, u_k$ such that $(u_i, u_{i+1}) \in E$ for $1 \leq i < k$. Furthermore, if $(u_k, u_1) \in E$, then $u_1, u_2, \ldots, u_k$ is called a cycle. We normally use $C$ to denote a cycle and the set of the edges in it. If $C$ contains $k$ vertices, it is a $k$-cycle. A triangle (quadrangle, resp.) is a 3-cycle (4-cycle, resp.). A cycle $C$ of $G$ divides the plane into its interior and exterior regions. If $C$ contains at least one vertex in its interior region, $C$ is called a separating cycle. If all facial cycles of $G$ are triangles, $G$ is a plane triangulation. $G$ is called a directed graph (digraph for short) if each edge of $G$ is assigned a direction. We abbreviate the words “counterclockwise” and “clockwise” as ccw and cw respectively.

The dual graph $G^* = (V^*, E^*)$ of a plane graph $G$ is defined as follows: For each face $F$ of $G$, $G^*$ has a node $v_F$. For each edge $e$ in $G$, $G^*$ has an edge $e^* = (v_{F_1}, v_{F_2})$ where $F_1$ and $F_2$ are the two faces of $G$ with $e$ on their common boundaries. $e^*$ is called the dual edge of $e$. For each vertex $v \in V$, the dual face of $v$ in $G^*$ is denoted by $v^*$.

Let $G$ be a 2-connected plane digraph with two specified exterior vertices $s$ and $t$. $G$ is called an st-plane graph if it is acyclic with $s$ as the only source and $t$ as the only sink. The properties of st-plane graphs were studied in [9,11]. In particular, for every face $f$ of $G$, its boundary cycle consists of two directed paths. The path on its left (right, resp.) side is called the left (right, resp.) path of $f$. There is exactly one source (sink, resp.) vertex on the boundary of $f$; it is called the source (sink, resp.) of $f$.

An orientation of a graph G is a digraph obtained from G by assigning a 
direction to each edge of G. We will use G to denote both the resulting digraph 
and the underlying undirected graph unless otherwise specified. (Its meaning will 
be clear from the context.) For a plane graph G and an exterior edge (s,t), an orientation of G is called an st-orientation if the resulting digraph is an st-plane graph. Note: For an st-orientation, we require that (s, t) is an exterior edge. This is the difference between an st-orientation and an st-plane graph.

Let $G$ be a 2-connected plane graph and $(s, t)$ an exterior edge. An st-numbering of $G$ is a one-to-one mapping $\xi : V \to \{1, 2, \ldots, n\}$ such that $\xi(s) = 1$, $\xi(t) = n$, and each vertex $v \neq s, t$ has two neighbors $u, w$ with $\xi(u) < \xi(v) < \xi(w)$, where $u$ ($w$, resp.) is called a smaller neighbor (bigger neighbor, resp.) of $v$. Given an st-numbering $\xi$ of $G$, we can orient $G$ by directing each edge in $E$ from its lower numbered end vertex to its higher numbered end vertex. The resulting orientation is called the orientation derived from $\xi$, which, obviously, is an st-orientation of $G$. On the other hand, if $G = (V, E)$ has an st-orientation $O$, we can define a one-to-one mapping $\xi : V \to \{1, \ldots, n\}$ by topological sort. It is easy to see that $\xi$ is an st-numbering and the orientation derived from $\xi$ is $O$. From now on, we will interchangeably use the term an st-numbering of $G$ and the term an st-orientation of $G$, where each edge of $G$ is directed accordingly.
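
The definition can be turned into a small checker. The following Python sketch tests the st-numbering conditions on an adjacency-set representation and derives the orientation. It does not verify planarity or that $(s, t)$ lies on the exterior face, and the example graph (a 4-cycle $a, b, c, d$ with chord $ac$) and all function names are assumptions made for illustration.

```python
def is_st_numbering(adj: dict, xi: dict, s: str, t: str) -> bool:
    """Check the numbering conditions: bijection onto 1..n, xi(s)=1, xi(t)=n,
    and every other vertex has both a smaller and a bigger neighbor."""
    n = len(adj)
    if sorted(xi.values()) != list(range(1, n + 1)):
        return False
    if t not in adj[s] or xi[s] != 1 or xi[t] != n:
        return False
    for v, nbrs in adj.items():
        if v in (s, t):
            continue
        has_smaller = any(xi[u] < xi[v] for u in nbrs)
        has_bigger = any(xi[w] > xi[v] for w in nbrs)
        if not (has_smaller and has_bigger):
            return False
    return True


def derived_orientation(adj: dict, xi: dict) -> list:
    """Direct every edge from its lower numbered to its higher numbered end."""
    return sorted((u, v) for u in adj for v in adj[u] if xi[u] < xi[v])


adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
good = {"a": 1, "b": 2, "c": 3, "d": 4}
bad = {"a": 1, "c": 2, "b": 3, "d": 4}  # b would have no bigger neighbor
```

With $s = a$ and $t = d$, the numbering `good` passes the check, while `bad` fails because vertex $b$ has only smaller neighbors.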

Given an st-orientation $O$ of $G$, consider the dual graph $G^*$ of $G$. For each $e \in G$, we direct its dual edge $e^*$ from the face on the left of $e$ to the face on the right of $e$ when we walk on $e$ along its direction in $O$. We then reverse the direction of $(s, t)^*$. It was shown in [11,14] that this orientation is an st-orientation of $G^*$ with $(s, t)^*$ as the distinguished exterior edge. We denote the source by s, and the sink by t. When we embed $G$ and $G^*$ on the plane simultaneously, we fix $t^*$ to be the exterior face of $G^*$. We will denote this orientation of $G^*$ by $O^*$ and call it the corresponding st-orientation of $O$.

Lempel et al. [9] showed that for every 2-connected plane graph G and an
exterior edge (s, t), there exists an st-numbering. The following lemma was given 
in [11,14]: 

Lemma 1. Let G be a 2-connected plane graph. Let O be an st-orientation of G. A VR of G can be obtained from O in linear time. The height of the VR is the length of the longest directed path in O. The width of the VR is the length of the longest directed path in the corresponding st-orientation O* of G*.
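
Lemma 1 reduces the height of the VR to a longest-path computation in an acyclic digraph. A minimal Python sketch of that computation follows. It assumes that path length is counted in edges and that the orientation is given as a list of directed pairs. It does not build the dual orientation O* or the drawing itself, and the function name and example orientation are hypothetical.

```python
from functools import lru_cache


def longest_path_length(edges: list) -> int:
    """Number of edges on a longest directed path of an acyclic digraph."""
    succ: dict = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        succ.setdefault(v, [])

    @lru_cache(maxsize=None)
    def longest_from(u) -> int:
        # Longest path starting at u; sinks contribute length 0.
        return max((1 + longest_from(v) for v in succ[u]), default=0)

    return max((longest_from(u) for u in succ), default=0)


# Example st-orientation on four vertices with source a and sink d.
orientation = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("c", "d")]
height = longest_path_length(orientation)
```

For this orientation the longest directed path is $a \to b \to c \to d$, so the computed length is 3. The same routine applied to O* would give the quantity that Lemma 1 uses for the width.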

Let $G$ be a plane triangulation with $n > 3$ vertices and $m = 3n - 6$ edges. Let $v_1, v_2, \ldots, v_n$ be an ordering of the vertices of $G$ where $v_1, v_2, v_n$ are the three exterior vertices of $G$ in ccw order. Let $G_k$ be the subgraph of $G$ induced by $v_1, v_2, \ldots, v_k$ and $H_k$ the exterior face of $G_k$. Let $G - G_k$ be the subgraph of $G$ obtained by removing $v_1, v_2, \ldots, v_k$.

Definition 1. [4] An ordering $v_1, \ldots, v_n$ of a plane triangulation $G$ is canonical if the following hold for every $k = 3, \ldots, n$:

1. $G_k$ is 2-connected, and its exterior face $H_k$ is a cycle containing the edge $(v_1, v_2)$.

2. The vertex $v_k$ is on the exterior face of $G_k$, and its neighbors in $G_{k-1}$ form a subinterval of the path $H_{k-1} - (v_1, v_2)$ with at least two vertices. Furthermore, if $k < n$, $v_k$ has at least one neighbor in $G - G_k$. (Note that the case $k = 3$ is degenerate, and $H_2 - (v_1, v_2)$ is regarded as the edge $(v_1, v_2)$ itself.)

A canonical ordering of $G$ can be viewed as an order in which $G$ is reconstructed from a single edge $(v_1, v_2)$ step by step. At step $k$, when $v_k$ is added to construct $G_k$, let $c_l, c_{l+1}, \ldots$ be the lower ordered neighbors of $v_k$ from left to right on the exterior face of $G_{k-1}$. We call $(v_k, c_l)$ the left edge of $v_k$. The collection $T$ of the left edges of the vertices $v_j$ for $3 \leq j \leq n$ plus the edge $(v_1, v_2)$ is a spanning tree of $G$ and is called a canonical ordering tree of $G$ [4,6].

3 Compact Visibility Representation of Plane Graphs 

We assume that G is a 2-connected plane graph for now. First, we introduce 
several concepts and a technical theorem whose proof is omitted. 

Definition 2. Let $G$ be a 2-connected plane graph, $O$ be an st-orientation of $G$. Let $G^*$ be the dual graph of $G$, $O^*$ be the corresponding st-orientation of $O$.

1. For any vertex $v \neq s, t$ of $G$, define: $Hand(v) = \{(v, u) \in E \mid u \text{ is a bigger neighbor of } v\}$; $Foot(v) = \{(u, v) \in E \mid u \text{ is a smaller neighbor of } v\}$. For $v = s$, define $Foot(s) = \{(s, t)\}$, $Hand(s) = \{(s, u) \in E \mid u \neq t\}$. (Note, the two definitions for the source $s$ are special.)

2. For any face $v^* \neq t^*$ of $G^*$, define: $cover(v^*) = \{e^* \mid e \in Hand(v)\}$; $sheet(v^*) = \{e^* \mid e \in Foot(v)\}$. Note that, for any face $v^* \neq t^*$ of $G^*$ (including $s^*$), its boundary consists of two directed paths, one is $cover(v^*)$, the other is $sheet(v^*)$.

3. For any vertex $v \neq t$ of $G$, define $score_O(v) = \min\{|Hand(v)|, |Foot(v)|\}$, and $Score_O(G) = \sum_{v \neq t} score_O(v)$.



Theorem 1. Let $G$ be a 2-connected plane graph with an st-orientation $O$. Let $O^*$ be the corresponding st-orientation of $G^*$. Then $G$ has a VR with width at most $|E| - Score_O(G)$.
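
To make the quantities in Theorem 1 concrete, the sketch below evaluates $|Hand(v)|$, $|Foot(v)|$ and the resulting score from an st-numbering. It assumes that $Score_O(G)$ sums $score_O(v)$ over all vertices $v \neq t$, which is how item 3 of Definition 2 is read here, and it uses a small hypothetical graph (a 4-cycle $a, b, c, d$ with chord $ac$, $s = a$, $t = d$). The function name is illustrative.

```python
def score_of_orientation(adj: dict, xi: dict, s: str, t: str) -> int:
    """Sum of min(|Hand(v)|, |Foot(v)|) over all vertices v != t."""
    total = 0
    for v, nbrs in adj.items():
        if v == t:
            continue
        bigger = [u for u in nbrs if xi[u] > xi[v]]
        smaller = [u for u in nbrs if xi[u] < xi[v]]
        if v == s:
            # Special case for the source: Foot(s) = {(s, t)}, Hand(s) excludes (s, t).
            hand = len([u for u in bigger if u != t])
            foot = 1
        else:
            hand, foot = len(bigger), len(smaller)
        total += min(hand, foot)
    return total


adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
xi = {"a": 1, "b": 2, "c": 3, "d": 4}
num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
score = score_of_orientation(adj, xi, "a", "d")
width_bound = num_edges - score  # the bound |E| - Score_O(G) of Theorem 1
```

In this example each of $a$, $b$, $c$ contributes 1, so the score is 3 and the bound on the width is $5 - 3 = 2$.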

Thus, in order to shorten the width of VR of a plane graph $G$, we need to find an st-orientation $O$ of $G$ such that $Score_O(G)$ is as large as possible.

Without loss of generality, we assume that $G$ is a plane triangulation with $n > 4$ vertices in the rest of this section. (Otherwise, we triangulate it into a plane triangulation $G'$ by adding edges; a VR of $G$ can be obtained from a VR of $G'$ by deleting the vertical line segments representing the added edges.) First we need to introduce the concept of Schnyder’s realizer [12,13]:

Fig. 1. (a) A plane triangulation $G$ and one realizer $\mathcal{R}$ of $G$; (b) A VR of $G$.



Definition 3. Let $G$ be a plane triangulation with three exterior vertices $v_1, v_2, v_n$ in ccw order. A Schnyder’s realizer (realizer in short) $\mathcal{R}$ of $G$ is a partition of the interior edges of $G$ into three sets $T_1, T_2, T_n$ of directed edges such that the following hold:

- For each $i \in \{1, 2, n\}$, the interior edges incident to $v_i$ are in $T_i$ and directed toward $v_i$.

- For each interior vertex $v$ of $G$, the neighbors of $v$ form six blocks $U_1, D_n, U_2, D_1, U_n$, and $D_2$ in ccw order around $v$, where $U_j$ and $D_j$ ($j = 1, 2, n$) are the parent and the children of $v$ in $T_j$.

It was shown in [12,13] that every plane triangulation $G$ has a realizer $\mathcal{R}$, which can be obtained in linear time. Each $T_i$ ($i \in \{1, 2, n\}$) is a tree rooted at the vertex $v_i$ containing all interior vertices of $G$. Fig. 1 (a) shows a realizer of a plane triangulation $G$. Three trees $T_1, T_2, T_n$ are drawn as solid, dashed, and dotted lines, respectively. (Ignore the small boxes containing integers for now. Their meaning will be explained later.)

The following lemma shows how to obtain st-numberings from a Schnyder’s 
realizer [15]. 

Lemma 2. Let $G$ be a plane triangulation and $\mathcal{R} = \{T_1, T_2, T_n\}$ a Schnyder’s realizer of $G$. Let $T'$ be the tree obtained from $T_i$ plus the two exterior edges adjacent to $v_i$ in $G$. $T'$ is rooted at $v_i$. Then the ccw preordering of the vertices of $G$ with respect to $T'$ is an st-numbering of $G$.

For example, consider the tree $T_1$ (rooted at $v_1$) shown in Fig. 1. The union of $T_1$ and the two exterior edges $(v_2, v_1)$ and $(v_n, v_1)$ is a tree of $G$, denoted by $T_1'$. The ccw preordering of the vertices of $G$ with respect to $T_1'$ is shown as integers inside the small boxes. It is an st-orientation of $G$ by Lemma 2, denoted by $O_1$. Similarly, we have two other st-orientations $O_2$, $O_n$.

Denote the set of interior vertices of $G$ by $I$. Then for each vertex $v \in I$, $score_{O_i}(v)$, $i = 1, 2, n$, is always definable. And obviously $Score_{O_i}(G) = 2 + \sum_{v \in I} score_{O_i}(v)$. We denote $scoresum(v) = \sum_{i=1,2,n} score_{O_i}(v)$ for each $v \in I$.

Next, we want to find a lower bound of $\sum_{i=1,2,n} Score_{O_i}(G)$.

Let $inter(v) = \sum_{i=1,2,n} [v \text{ is not a leaf of } T_i]$, where $[c]$ is 1 (0, resp.) if condition $c$ is true (false, resp.). Lin et al. partitioned the interior vertices of $G$ into three subsets $A, B, C$ as follows [10]:

$A = \{v \mid inter(v) = 0\}$;

$B = \{v \mid inter(v) = 2, \deg(v) = 5\}$;

$C = \{v \notin B \mid inter(v) \geq 1\}$.

Let $\zeta_i$ be the number of internal (namely, non-leaf) vertices in $T_i$. An interior face $f$ of $G$ is cyclic with respect to $\mathcal{R}$ if each of its three edges belongs to different trees of $\mathcal{R}$. Denote the number of cyclic interior faces with respect to $\mathcal{R}$ by $\Delta(\mathcal{R})$. For example, in Fig. 1 (a), the faces {5,7,6}, {7,9,8}, {9,12,11} (marked by empty circles) are the cyclic faces in $\mathcal{R}$. So, $\Delta(\mathcal{R}) = 3$.

The following results were proved in [1,10]:

Lemma 3. Let $G$ be a plane triangulation with $n > 4$ vertices. Let $v_1, v_2, v_n$ be the exterior vertices of $G$ in the ccw order. Let $\mathcal{R} = \{T_1, T_2, T_n\}$ be any realizer of $G$, where $T_i$, $i \in \{1, 2, n\}$, is rooted at $v_i$ respectively. Let $k$ be the number of connected components of the graph $G[B]$, which is a subgraph of $G$ induced by $B$.

1. $\zeta_1 + \zeta_2 + \zeta_n - \Delta(\mathcal{R}) = n - 1$.

2. $\zeta_1 + \zeta_2 + \zeta_n - 3 = \sum_{v \in I} inter(v) \geq 2|B| + |C|$.

3. $scoresum(v) \geq 3 + 2 \cdot inter(v) - [v \in B]$, $v \in I$.

4. $|B| - k \leq 2\Delta(\mathcal{R})$.

Now we can prove the following theorem: 

Theorem 2. Let $G$ be a plane triangulation with $n > 4$ vertices, $\mathcal{R} = \{T_1, T_2, T_n\}$ be any Schnyder’s realizer of $G$. Then $\sum_{i=1,2,n} Score_{O_i}(G) \geq \frac{14n}{3} - 10$.

Proof. Let $G[B]$ be the subgraph of $G$ induced by $B$. Suppose that $G[B]$ has $k$ connected components. Let $|B| - k = \delta\Delta(\mathcal{R})$; then we have $0 \leq \delta \leq 2$ by Lemma 3 (4). Let $B_t$, $t = 1, 2, \ldots, k$, be all the connected components of $G[B]$. Lin et al. observed that [10]: any two distinct vertices of $A$ are not adjacent in $G$; and each vertex in $A$ is adjacent to at most one $B_t$. Thus, we know that the number of the connected components of $G[A \cup B]$ is at least $k$. Considering $G - (A \cup B)$, each interior face of $G - (A \cup B)$ contains at most one connected component of $G[B]$. We remove edges of $G - (A \cup B)$ until each interior face contains exactly one connected component of $G[B]$, and denote this graph by $G'$. (If $G[B]$ is empty, $G'$ does not have an interior face.) Let $F_i$ ($i = 3, 4, \ldots$) be the set of interior faces of $G'$ with $i$ edges on its boundary. Thus, we have:

$$k = \sum_{i=3}^{\infty} |F_i| \qquad (1)$$

For any face in $F_i$, $i \geq 4$, we can triangulate it into $i - 2$ faces; then by applying Euler’s formula to the resulting graph, the number of its interior faces is at most $2(|C| + 3) - 5 = 2|C| + 1$. Thus:

$$\sum_{i=3}^{\infty} (i - 2)|F_i| \leq 2(|C| + 3) - 5 = 2|C| + 1.$$

Therefore:

$$|C| \geq \sum_{i=3}^{\infty} \frac{i-2}{2}|F_i| - \frac{1}{2} \qquad (2)$$

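Using Equation (1) and (2), we have:

$$|C| \geq \frac{k}{2} + \frac{1}{2}|F_4| + |F_5| + \frac{3}{2}|F_6| - \frac{1}{2} \qquad (3)$$

Applying Lemma 3 (1) and (2) and the above equation, we have:

$$n + \Delta(\mathcal{R}) - 4 \geq 2|B| + |C| \geq \frac{5}{2}k + 2\delta\Delta(\mathcal{R}) - \frac{1}{2} + \frac{1}{2}|F_4| + |F_5| + \frac{3}{2}|F_6|.$$

Thus, we have:

$$\frac{5}{2}k \leq n + \Delta(\mathcal{R}) - 2\delta\Delta(\mathcal{R}) - \frac{7}{2} - \frac{1}{2}|F_4| - |F_5| - \frac{3}{2}|F_6| \qquad (4)$$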

Because any vertex in $B$ has degree 5 in $G$, and an interior face of $G'$ in $F_3$ contains at least 1 vertex from $B$ in $G$, it contains at least 3 vertices from $A \cup B$ in $G$. Similarly, an interior face of $G'$ in $F_4$ contains at least 2 vertices from $A \cup B$ in $G$. An interior face of $G'$ in $F_i$ for $i \geq 5$ contains at least 1 vertex from $A \cup B$ in $G$. Thus:

$$3|F_3| + 2|F_4| + \sum_{i=5}^{\infty} |F_i| \leq |A| + |B|.$$

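Add this to Equation (2), and use $|A| + |B| + |C| = n - 3$. We have:

$$\frac{7}{2}|F_3| + 3|F_4| + \frac{5}{2}|F_5| + 3|F_6| + \sum_{i=7}^{\infty} \frac{i}{2}|F_i| \leq n - \frac{5}{2}$$

Combining it with Equation (1), we have:

$$\frac{7}{2}k \leq \left(\frac{7}{2}|F_3| + 3|F_4| + \frac{5}{2}|F_5| + 3|F_6| + \sum_{i=7}^{\infty} \frac{i}{2}|F_i|\right) + \left(\frac{1}{2}|F_4| + |F_5| + \frac{1}{2}|F_6|\right) \leq n - \frac{5}{2} + \frac{1}{2}|F_4| + |F_5| + \frac{1}{2}|F_6| \qquad (5)$$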

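Add Equation (4) and (5), and divide both sides by 6. We have:

$$k \leq \frac{n}{3} + \frac{1}{6}\Delta(\mathcal{R}) - \frac{1}{3}\delta\Delta(\mathcal{R}) - 1 \qquad (6)$$

Applying Lemma 3 (2), (3) and the above equation, and noting that $0 \leq \delta \leq 2$, we have:

$$\begin{aligned}
\sum_{i=1,2,n} Score_{O_i}(G) &= 6 + \sum_{v \in I} scoresum(v) \\
&\geq 6 + \sum_{v \in I} \{3 + 2 \cdot inter(v)\} - |B| \\
&= 6 + 3(n - 3) + 2(n + \Delta(\mathcal{R}) - 4) - |B| \\
&= 5n + 2\Delta(\mathcal{R}) - 11 - |B| \\
&= 5n + 2\Delta(\mathcal{R}) - 11 - (|B| - k) - k \\
&\geq 5n + 2\Delta(\mathcal{R}) - 11 - \delta\Delta(\mathcal{R}) - \frac{n}{3} - \frac{1}{6}\Delta(\mathcal{R}) + \frac{1}{3}\delta\Delta(\mathcal{R}) + 1 \\
&= \frac{14n}{3} + \left(\frac{11}{6} - \frac{2}{3}\delta\right)\Delta(\mathcal{R}) - 10 \geq \frac{14n}{3} - 10 \qquad (7)
\end{aligned}$$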
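
The last two steps of (7) are plain algebra and can be checked with exact rational arithmetic. The sketch below substitutes the bound (6) for $k$ and $\delta\Delta(\mathcal{R})$ for $|B| - k$, then compares the result with the closed form $\frac{14n}{3} + (\frac{11}{6} - \frac{2}{3}\delta)\Delta(\mathcal{R}) - 10$. The sampled values of $n$, $\Delta(\mathcal{R})$ and $\delta$ are arbitrary test inputs, not data from the paper, and the function name is hypothetical.

```python
from fractions import Fraction


def chain_7(n: int, delta_r: int, d: Fraction) -> tuple:
    """Return (expression after substituting (6), closed form) for the end of (7)."""
    k_bound = Fraction(n, 3) + Fraction(delta_r, 6) - d * delta_r / 3 - 1  # right side of (6)
    substituted = 5 * n + 2 * delta_r - 11 - d * delta_r - k_bound
    closed_form = Fraction(14 * n, 3) + (Fraction(11, 6) - Fraction(2, 3) * d) * delta_r - 10
    return substituted, closed_form


samples = [
    (n, delta_r, d)
    for n in range(4, 30)
    for delta_r in range(0, 8)
    for d in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2))
]
algebra_ok = all(a == b for a, b in (chain_7(*s) for s in samples))
bound_ok = all(chain_7(*s)[1] >= Fraction(14 * s[0], 3) - 10 for s in samples)
```

The final inequality holds because $\frac{11}{6} - \frac{2}{3}\delta \geq \frac{1}{2}$ whenever $0 \leq \delta \leq 2$.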

Theorem 3. Let $G$ be a plane triangulation with $n > 4$ vertices; then $G$ has a VR with width at most $\lfloor \frac{13n-24}{9} \rfloor$, which can be obtained in linear time.

Proof. Applying Theorem 2, we have $\sum_{i=1,2,n} Score_{O_i}(G) \geq \frac{14n}{3} - 10$. Thus, one of $Score_{O_i}(G) \geq \lceil \frac{14n-30}{9} \rceil$. Applying Theorem 1 and Lemma 1, the width of the VR is at most $3n - 6 - \lceil \frac{14n-30}{9} \rceil = \lfloor \frac{13n-24}{9} \rfloor$. It is easy to see that the running time is $O(n)$.
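
The final identity in this proof is integer arithmetic and easy to sanity-check. A small Python sketch, assuming the two bounds are $\lceil (14n-30)/9 \rceil$ for the best score and $\lfloor (13n-24)/9 \rfloor$ for the width (the function names are illustrative):

```python
def width_bound_via_score(n: int) -> int:
    """3n - 6 - ceil((14n - 30) / 9), using integer ceiling division."""
    best_score = -((-(14 * n - 30)) // 9)
    return 3 * n - 6 - best_score


def width_bound_closed(n: int) -> int:
    """floor((13n - 24) / 9)."""
    return (13 * n - 24) // 9


identity_holds = all(
    width_bound_via_score(n) == width_bound_closed(n) for n in range(4, 1001)
)
```

The identity follows because $3n - 6$ is an integer, so subtracting a ceiling from it equals the floor of the difference $(27n - 54 - 14n + 30)/9$.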

For example, Fig. 1 (b) gives a VR of G, using the st-numbering in Fig. 1 
(a). 

4 Fewer-Leaf Canonical Ordering Tree and Compact VR 
of 4-Connected Plane Triangulations 

In order to obtain a canonical ordering tree for a 4-connected plane triangulation 
with fewer leaves, we need another concept and a lemma from [5] as follows: 

Definition 4. Let $G'$ be a plane graph with four vertices $v_W, v_S, v_E, v_N$ in ccw order on its exterior face. A regular edge labeling (REL for short) of $G'$ is a partition of the interior edges of $G'$ into two subsets $S_1, S_2$ of directed edges such that the following hold:

1. For each interior vertex $v$, the edges incident to $v$ appear in ccw order around $v$ as follows: a set of edges in $S_1$ leaving $v$; a set of edges in $S_2$ entering $v$; a set of edges in $S_1$ entering $v$; a set of edges in $S_2$ leaving $v$. Each set is nonempty.

2. All interior edges incident to $v_N$ are in $S_1$ and entering $v_N$. All interior edges incident to $v_W$ are in $S_2$ and leaving $v_W$. All interior edges incident to $v_S$ are in $S_1$ and leaving $v_S$. All interior edges incident to $v_E$ are in $S_2$ and entering $v_E$. Each block is not empty.

Lemma 4. Let $G'$ be a plane graph with four vertices on its exterior face. $G'$ has a REL if and only if the following conditions hold: (1) Every interior face of $G'$ is a triangle and the exterior face of $G'$ is a quadrangle; (2) $G'$ has no separating triangles.

A graph satisfying the two conditions in the above lemma will be called a proper triangulated plane (PTP for short) graph.

Fig. 2. (a) A 4-connected plane triangulation $G$ and the PTP graph $G'$ after deleting $(v_W, v_E)$, (b) An st-numbering of $G$, (c) A VR of $G$.




Fig. 3. The S-N net $G_1$ and the W-E net $G_2$ for the PTP graph G' in Fig. 2 (a).



Let G be a 4-connected plane triangulation with three exterior vertices $v_W, v_E, v_N$
in ccw order. Delete the edge $(v_W, v_E)$. Denote the new exterior vertex by
$v_S$ and the resulting plane graph by G'. (See Fig. 2 (a) for an example.) G' does
not have separating triangles, and it has four exterior vertices $v_W, v_S, v_E, v_N$ in
ccw order on its own exterior face. Thus, G' is a PTP graph and has a REL
$(S_1, S_2)$ according to Lemma 4.

We investigate the properties of the REL of G'. Denote by $G_1$ the directed
subgraph of G' induced by $S_1$ and the four exterior edges directed as $v_S \to v_W$,
$v_W \to v_N$, $v_S \to v_E$, $v_E \to v_N$. Let $E_1$ be the edge set of $G_1$. ($E_1$ is the union
of $S_1$ and the four exterior edges.) Then $G_1$ is an st-plane graph with source $v_S$
and sink $v_N$. Similarly, let $G_2$ be the directed subgraph of G' induced by $S_2$ and
the four exterior edges directed as $v_W \to v_S$, $v_S \to v_E$, $v_W \to v_N$, $v_N \to v_E$. Let
$E_2$ be the edge set of $G_2$. Then $G_2$ is an st-plane graph with source $v_W$ and sink
$v_E$. We will call $G_1$ the S-N net and $G_2$ the W-E net of G' derived from the REL
$(S_1, S_2)$. For example, Fig. 3 shows a REL with its derived S-N and W-E nets
for the PTP graph G' shown in Fig. 2 (a). (Ignore the small boxes containing
integers for now. Their meaning will be explained later.)

Consider the S-N net $G_1$. For each edge $e \in E_1$, let left(e) (right(e), resp.)
denote the face of $G_1$ on the left (right, resp.) of $e$. Define the dual graph of $G_1$,
denoted by $G_1^*$, as follows. The node set of $G_1^*$ is the set of the interior faces of
$G_1$ plus two exterior faces $f_W$ and $f_E$ (which are obtained by “splitting” the
exterior face of $G_1$). For each edge $e \in E_1$, there is a corresponding edge $e^*$ in
$G_1^*$ directed from the face left(e) to the face right(e). $G_1$ is an st-plane graph,
so $G_1^*$ is also an st-plane graph with $f_W$ as the only source and $f_E$ as the only
sink [9,11]. For each face $f$ in $G_1$, define the upper left (upper right, resp.) edge
of $f$ to be the last edge of the left (right, resp.) path of $f$, the lower left (lower
right, resp.) edge of $f$ to be the first edge of the left (right, resp.) path of $f$. Note
that these four edges are distinct. (This is because $(S_1, S_2)$ is a REL, so each of
the two directed paths on the boundary of $f$ contains at least two edges.)

Similarly, we can define the dual $G_2^*$ for $G_2$ with the word “left” (“right”,
resp.) replaced by the word “below” (“above”, resp.). $G_2^*$ is also an st-plane
graph with source $f_S$ and sink $f_N$. For each face $f$ in $G_2$, we similarly define the
left above, left below, right above, and right below edge of $f$. These four edges are
also distinct.

For $i = 1, 2$, assume $G_i$ has $k_i$ faces. Thus $G_i^*$ has $k_i + 1$ nodes. By using
topological sort, we can assign a distinct number from $0, 1, \ldots, k_1$ to each node
of $G_1^*$ such that, for any edge $e$ in $E_1$, the number assigned to the face left(e) is
smaller than the number assigned to the face right(e). The faces $f_W$ and $f_E$ are
numbered by 0 and $k_1$, respectively. Similarly, we can assign a distinct number
from $0, 1, \ldots, k_2$ to each node of $G_2^*$ such that, for any edge $e \in E_2$, the number
assigned to the face below(e) is smaller than the number assigned to the face
above(e). Such an assignment is called a consistent numbering of $G_i^*$, $i = 1, 2$ [5].
They can be computed in linear time by using topological sort.
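
The numbering step can be illustrated with a small sketch. It assumes the dual graph is already available as a list of directed edges (left face to right face) with a single source; the face names and the helper `consistent_numbering` are hypothetical and not taken from the paper.

```python
from collections import deque

def consistent_numbering(nodes, edges, source):
    """Number DAG nodes 0..len(nodes)-1 so that every edge (a, b) has number[a] < number[b]."""
    successors = {v: [] for v in nodes}
    indegree = {v: 0 for v in nodes}
    for a, b in edges:
        successors[a].append(b)
        indegree[b] += 1
    number = {}
    queue = deque([source])          # assumed to be the only node with indegree 0
    while queue:
        v = queue.popleft()
        number[v] = len(number)      # next unused number
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:     # all predecessors of w are already numbered
                queue.append(w)
    if len(number) != len(nodes):
        raise ValueError("input is not an acyclic graph with the given single source")
    return number

# Hypothetical dual graph: two interior faces f1, f2 between the split exterior faces.
dual_nodes = ["f_W", "f1", "f2", "f_E"]
dual_edges = [("f_W", "f1"), ("f_W", "f2"), ("f1", "f2"), ("f1", "f_E"), ("f2", "f_E")]
numbering = consistent_numbering(dual_nodes, dual_edges, "f_W")
```

Each node and edge is handled a constant number of times, which matches the linear running time claimed in the text.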

Fix a consistent numbering of $G_1^*$. Let $f_r$ ($0 \le r \le k_1$) denote the face
numbered by $r$. We define the i-th S-N separation path $SN_i$ to be the directed
path in $G_1$ such that the faces numbered by $0, 1, \ldots, i - 1$ are on its left and
the other faces are on its right. (These paths were called path system in [5].) $G_1$
has exactly $k_1$ S-N separation paths and all of them are directed from $v_S$ to $v_N$.
Note that $SN_{r+1}$ is obtained from $SN_r$ by deleting the left path of the face $f_r$
and adding the right path of $f_r$.

The j-th W-E separation path $WE_j$ of $G_2$ is defined similarly. $G_2$ has exactly
$k_2$ W-E separation paths and all of them are directed paths from $v_W$ to $v_E$ [5].

For example, the faces in Fig. 3 are consistently numbered for $G_1$ and $G_2$
(indicated by the integers in small boxes). The 4-th W-E separation path $WE_4$
in $G_2$ consists of the vertices $v_W, j, e, d, b, v_E$. Note that there are 4 vertices
($i, c, h, v_S$) below it and 4 vertices ($f, a, g, v_N$) above it.

We have the following technical lemma. The proof is omitted: 

Lemma 5. Let G' be a PTP graph with n vertices. Let $G_1$ be the S-N net and
$G_2$ be the W-E net derived from a REL $(S_1, S_2)$ of G'. Suppose $G_1$ has $k_1$ faces
and $G_2$ has $k_2$ faces. Then:

(1) $k_1 + k_2 = n + 1$.

(2) The i-th S-N separation path $SN_i$ has at least $i - 1$ vertices on its left and
at least $k_1 - i$ vertices on its right in $G_1$. The j-th W-E separation path $WE_j$
has at least $j - 1$ vertices below it and at least $k_2 - j$ vertices above it in $G_2$.
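
Since the proof is omitted, the following is only a sketch of how part (1) could be checked with Euler's formula. It assumes that $k_i$ counts the faces of $G_i$ with the exterior face counted once, which is consistent with the dual of $G_i$ having $k_i + 1$ nodes after the exterior face is split.

```latex
% G' is a PTP graph: n vertices, triangular interior faces, quadrangular exterior face.
% Euler's formula then gives |E(G')| = 3n - 7, of which 4 edges are exterior.
\begin{align*}
|S_1| + |S_2| &= (3n - 7) - 4 = 3n - 11, \\
k_i &= |E_i| - n + 2 = |S_i| + 6 - n \qquad (i = 1, 2), \\
k_1 + k_2 &= (3n - 11) + 12 - 2n = n + 1.
\end{align*}
```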

The next theorem shows how to get a canonical ordering tree from the S-N
net and the W-E net, respectively. We prove that one of the two canonical
ordering trees has at most leaves. The proof is omitted:





Fig. 4. (a) A canonical ordering tree of G from the S-N net in Fig. 3 (a), (b) A canonical 
ordering tree of G from the W-E net in Fig. 3 (b). 



Theorem 4. Let G be a 4-connected plane triangulation with three exterior
vertices $v_W, v_E, v_N$ in ccw order. Delete the edge $(v_W, v_E)$. Denote the new exterior
vertex by $v_S$ and the resulting graph by G'. Let $(S_1, S_2)$ be a REL of G'. Let $G_1$
be the S-N net and $G_2$ be the W-E net derived from $(S_1, S_2)$. Then the following
statements hold:

1. For each vertex $v \ne v_N, v_S$ in $G_1$, select an outgoing edge in $G_1$. For $v_S$,
select an outgoing edge not leading to $v_W$ or $v_E$. Then the set $T_1$ of the
selected edges is a canonical ordering tree of G.

2. For each vertex $v \ne v_W, v_E$ in $G_2$, select an outgoing edge in $G_2$. For $v_W$,
select the edge $(v_W, v_E)$. Then the set $T_2$ of the selected edges is a canonical
ordering tree of G.

3. One of $T_1, T_2$ has at most leaves. It can be obtained in linear time.

Fig. 4 (a) shows the tree described in the statement (1) for the S-N net in 
Fig. 3 (a). Fig. 4 (b) shows the tree described in the statement (2) for the W-E 
net in Fig. 3 (b). 

We omit the proof of the following theorem: 

Theorem 5. Every 4-connected plane graph G with n vertices has a VR with
height at most which can be obtained in linear time.

Fig. 2 (c) shows a VR of the graph G using the st-numbering in Fig. 2 (b). 


Topology Matters: Smoothed Competitiveness of
Metrical Task Systems*

Abstract. We consider metrical task systems, a general framework to
model online problems. Borodin, Linial and Saks [3] presented a
deterministic work function algorithm (WFA) for metrical task systems having
a tight competitive ratio of $2n - 1$. We present a smoothed competitive
analysis of WFA. Given an adversarial task sequence, we smoothen the
request costs by means of a symmetric additive smoothing model and
analyze the competitive ratio of WFA on the smoothed task sequence.
We prove upper and matching lower bounds on the smoothed competitive
ratio of WFA. Our analysis reveals that the smoothed competitive
ratio of WFA is much better than $O(n)$ and that it depends on several
topological parameters of the underlying graph $G$, such as the maximum
degree $D$ and the diameter. For example, already for small perturbations
the smoothed competitive ratio of WFA reduces to $O(\log n)$ on a clique
or a complete binary tree and to $O(\sqrt{n})$ on a line. We also provide the
first average case analysis of WFA showing that its expected competitive
ratio is $O(\log(D))$ for various distributions.



1 Introduction 

Borodin, Linial and Saks [3] introduced a general framework to model online
problems, called metrical task systems. Many important online problems can be
formulated as metrical task systems; for example, the paging problem, the static
list accessing problem and the k-server problem.

We are given an undirected and connected graph $G = (V, E)$, with node set
$V$ and edge set $E$, and a positive length function $\lambda : E \to \mathbb{R}^+$ on the edges
of $G$. We extend $\lambda$ to a metric $\delta$ on $G$. Let $\delta : V \times V \to \mathbb{R}^+$ be a distance
function such that $\delta(u, v)$ denotes the shortest path distance (with respect to $\lambda$)
between any two nodes $u$ and $v$ in $G$. A task $\tau$ is an $n$-vector $(r(v_1), \ldots, r(v_n))$
of request costs. The cost to process task $\tau$ in node $v_i$ is $r(v_i) \in \mathbb{R}^+ \cup \{\infty\}$.
The online algorithm starts from a given initial position $s_0 \in V$ and has to
service a sequence $S = (\tau_1, \ldots, \tau_\ell)$ of tasks, arriving one at a time. If the online
algorithm resides after task $\tau_{t-1}$ in node $u$, the cost to service task $\tau_t$ in node
$v$ is $\delta(u, v) + r_t(v)$; $\delta(u, v)$ is the transition cost and $r_t(v)$ is the processing cost.
The objective is to minimize the total transition plus processing cost.

* Partially supported by the Future and Emerging Technologies programme of the EU 
under contract number IST-1999-14186 (ALCOM-FT). 
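
Table 1. Upper bounds on the competitive ratio of WFA.

Upper Bounds

random tasks          $O\left(\frac{\sigma}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)\right)$

arbitrary tasks       $O\left(\frac{Diam}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)\right)$ and $O\left(\sqrt{n \cdot \frac{U_{\max}}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)}\right)$

β-elementary tasks    $O\left(\beta \cdot \frac{U_{\max}}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)\right)$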



Borodin, Linial and Saks [3] gave a deterministic online algorithm, known as 
the work function algorithm (WFA), for metrical task systems. WFA has a com- 
petitive ratio of 2n — 1, which is optimal. However, the competitive ratio is often 
an over-pessimistic estimation of the true performance of an online algorithm. 

Based on the idea underlying smoothed analysis [7], Becchetti et al. [2]
recently proposed smoothed competitive analysis as an alternative to worst case
competitive analysis of online algorithms. The idea is to randomly perturb, or
smoothen, an adversarial input instance $\check{S}$ and to analyze the performance of the
algorithm on the perturbed instances. Let $ALG[S]$ and $OPT[S]$, respectively, be
the cost of the online and the optimal offline algorithm on a smoothed instance
$S$ obtained from $\check{S}$. The smoothed competitive ratio $c$ of ALG with respect to a
smoothing distribution $f$ is defined as

$$c := \sup_{\check{S}} \; \mathbf{E}_{S \stackrel{f}{\leftarrow} \check{S}} \left[ \frac{ALG[S]}{OPT[S]} \right].$$

We use the notion of smoothed competitiveness to characterize the asymptotic
performance of WFA. We smoothen the request costs of each task according
to an additive symmetric smoothing model. Each cost entry is smoothed
by adding a random number chosen from a probability distribution $f$, whose
expectation coincides with the original cost entry. Our analysis holds for various
probability distributions, including the uniform and the normal distribution.
We use $\sigma$ to refer to the standard deviation of $f$. Our analysis reveals that the
smoothed competitive ratio of WFA is much better than its worst case competitive
ratio suggests and that it depends on certain topological parameters of the
underlying graph.



Definition of Topological Parameters. In this paper, we assume that the
underlying graph $G$ has $n$ nodes, minimum edge length $U_{\min}$, maximum edge
length $U_{\max}$, and maximum degree $D$. Furthermore, we use $Diam$ to refer to
the diameter of $G$, i.e., the maximum length of a shortest path between any two
nodes. Similarly, a graph has edge diameter $diam$ if any two nodes are connected
by a path of at most $diam$ edges. Observe that $diam \cdot U_{\min} \le Diam \le diam \cdot U_{\max}$.
We emphasize that these topological parameters are defined with respect to $G$
and its length function $\lambda$, not with respect to the resulting metric.

Table 2. Lower bounds on the competitive ratio of any deterministic online algorithm.

Lower Bounds 
arbitrary tasks 

- existential O ^ + log(D))^ and oi^n- +log(D))^ 

“ nniversal Q ^ log(D)^ and diam ■ TsiiiL . 

/3-elementary tasks 12 (/3 • ( + l)) (existential) 



We prove several upper bounds; see also Table 1.

1. We show that if the request costs are chosen randomly from a distribution
$f$, which is non-increasing in $[0, \infty)$, the expected competitive ratio of WFA
is

$$O\left(\frac{\sigma}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)\right).$$

In particular, WFA has an expected competitive ratio of $O(\log(D))$ if $\sigma =
\Theta(U_{\min})$. For example, we obtain a competitive ratio of $O(\log(n))$ on a clique
and of $O(1)$ on a binary tree.

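2. We prove two upper bounds on the smoothed competitive ratio of WFA:

$$O\left(\frac{Diam}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)\right) \quad \text{and} \quad O\left(\sqrt{n \cdot \frac{U_{\max}}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)}\right).$$

For example, if $\sigma = \Theta(U_{\min})$ and $U_{\max}/U_{\min} = \Theta(1)$, WFA has smoothed
competitive ratio $O(\log(n))$ on any constant diameter graph and $O(\sqrt{n})$ on
any constant degree graph. Note also that on a complete binary tree we
obtain an $O(\log(n))$ upper bound.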

3. We obtain a better upper bound on the smoothed competitive ratio of WFA
if the adversarial task sequence only consists of β-elementary tasks. A task
is β-elementary if it has at most β non-zero entries. We prove a smoothed
competitive ratio of

$$O\left(\beta \cdot \frac{U_{\max}}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)\right).$$

For example, if $\sigma = \Theta(U_{\min})$ and $U_{\max}/U_{\min} = \Theta(1)$, WFA has smoothed
competitive ratio $O(\beta \log(D))$ for β-elementary tasks.

We also present lower bounds; see Table 2. All our lower bounds hold for any
deterministic online algorithm and if the request costs are smoothed according
to the additive symmetric smoothing model. We distinguish between existential
and universal lower bounds. An existential lower bound, say $\Omega(f(n))$, means
that there exists a class of graphs such that every deterministic algorithm has
smoothed competitive ratio $\Omega(f(n))$ on these graphs. On the other hand, a
universal lower bound $\Omega(f(n))$ states that for any arbitrary graph, every
deterministic algorithm has smoothed competitive ratio $\Omega(f(n))$. Clearly, for metrical
task systems, the best lower bound we can hope to obtain is $\Omega(n)$. Therefore, if
we state a lower bound of $\Omega(f(n))$, we actually mean $\Omega(\min\{n, f(n)\})$.

4. For a large range of values for Diam and D, we present existential lower 
bounds that are asymptotically tight to the upper bounds stated in 2. 

5. We also prove two universal lower bounds on the smoothed competitive ratio: 

log(D)^ and ^ min | diam, ^ diam ■ -|- ly 

Assume that $U_{\max}/U_{\min} = \Theta(1)$. Then, the first bound matches the first
upper bound stated in 2 if the edge diameter $diam$ is constant, e.g., for a
clique. The second bound matches the second upper bound in 2 if $diam =
\Omega(n)$ and the maximum degree $D$ is constant, e.g., for a line.

6. For β-elementary tasks, we prove an existential lower bound of

$$\Omega\left(\beta \cdot \left(\frac{U_{\min}}{\sigma} + 1\right)\right).$$

This implies that the bound in 3 is tight up to a factor of $(U_{\max}/U_{\min}) \log(D)$.




Constrained Balls into Bins Game. Our analysis crucially relies on a lower 
bound on the cost of an optimal offline algorithm. We therefore study the growth 
of the work function values on a sequence of random requests. It turns out that 
the increase in the work function values can be modeled by a version of a balls 
into bins game with dependencies between the heights of the bins, which are 
specified by a constraint graph. We call this game the constrained balls into bins 
game. We believe that this game is also interesting independently of the context 
of this paper. 

Due to lack of space, we omit the lower bounds and some upper bound proofs 
from this extended abstract. We refer the reader to [6] for a complete version of 
this paper. 



2 Work Function Algorithm 

Let $S = (\tau_1, \ldots, \tau_\ell)$ be a request sequence, and let $s_0 \in V$ denote the initial
position. Let $S_t$ denote the subsequence of the first $t$ tasks of $S$. For each $t$,
$0 \le t \le \ell$, we define a function $w_t : V \to \mathbb{R}$ such that for each node $u \in V$,
$w_t(u)$ is the minimum offline cost to process $S_t$ starting in $s_0$ and ending in $u$.
The function $w_t$ is called the work function at time $t$ with respect to $S$ and $s_0$.

Let OPT denote an optimal offline algorithm. Clearly, the optimal offline
cost $OPT[S]$ on $S$ is equal to the minimum work function value at time $\ell$, i.e.,
$OPT[S] = \min_{u \in V}\{w_\ell(u)\}$. We can compute $w_t(u)$ for each $u \in V$ by dynamic
programming:

$$w_0(u) := \delta(s_0, u), \quad \text{and} \quad w_t(u) := \min_{v \in V}\{w_{t-1}(v) + r_t(v) + \delta(v, u)\}. \qquad (1)$$
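
The dynamic program (1) can be sketched in a few lines. This is only an illustration under stated assumptions: the metric is passed in as a precomputed distance matrix `delta`, tasks are lists of request costs, and the function name `work_functions` and the three-node example are hypothetical.

```python
def work_functions(delta, s0, tasks):
    """Return [w_0, w_1, ..., w_l], where w_t[u] is the minimum offline cost to
    process the first t tasks starting in s0 and ending in node u."""
    n = len(delta)
    w = [delta[s0][u] for u in range(n)]            # w_0(u) = delta(s0, u)
    history = [w]
    for r in tasks:                                 # r[v] is the request cost r_t(v)
        w = [min(w[v] + r[v] + delta[v][u] for v in range(n)) for u in range(n)]
        history.append(w)
    return history

# Hypothetical example: a line 0 - 1 - 2 with unit edge lengths, start in node 0.
delta = [[0, 1, 2],
         [1, 0, 1],
         [2, 1, 0]]
tasks = [[1, 0, 0], [0, 2, 0]]
history = work_functions(delta, 0, tasks)
opt_cost = min(history[-1])                         # OPT[S] = min_u w_l(u)
```

Each update takes $O(n^2)$ time per task in this direct form, since every pair $(v, u)$ is inspected once.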

We next describe the online work function algorithm; see also [3,1]. Intuitively,
a good strategy for an online algorithm to process task $\tau_t$ is to move
to a node where OPT would reside if $\tau_t$ were the final task. However, the
competitive ratio of an algorithm that solely sticks to this policy can become
arbitrarily bad. A slight modification gives a $(2n - 1)$-competitive algorithm: Instead
of blindly (no matter at what cost) traveling to the node of minimum work
function value, we additionally take the transition cost into account. Essentially,
this is the idea underlying the work function algorithm.

Work Function Algorithm (WFA): Let $s_0, \ldots, s_{t-1}$ denote the sequence of
nodes visited by WFA to process $S_{t-1}$. Then, to process task $\tau_t$, WFA moves to a
node $s_t$ that minimizes $w_t(v) + \delta(s_{t-1}, v)$ for all $v \in V$. There is always a choice
for $s_t$ such that in addition $w_t(s_t) = w_{t-1}(s_t) + r_t(s_t)$. More formally,

$$s_t := \arg\min_{v \in V}\{w_t(v) + \delta(s_{t-1}, v)\} \quad \text{such that} \quad w_t(s_t) = w_{t-1}(s_t) + r_t(s_t). \qquad (2)$$
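
A minimal sketch of rule (2) follows, assuming integer costs (so that the equality test is exact) and a precomputed distance matrix; the function name `wfa` and the example instance are hypothetical. Among the minimizers of $w_t(v) + \delta(s_{t-1}, v)$ it prefers one that satisfies the side condition, as the text states such a choice always exists.

```python
def wfa(delta, s0, tasks):
    """Run the work function algorithm; return (visited nodes, total cost)."""
    n = len(delta)
    w = [delta[s0][u] for u in range(n)]            # w_0
    pos, cost, visited = s0, 0, [s0]
    for r in tasks:
        # Work function update, equation (1).
        w_new = [min(w[v] + r[v] + delta[v][u] for v in range(n)) for u in range(n)]
        # Rule (2): minimize w_t(v) + delta(s_{t-1}, v) ...
        scores = [w_new[v] + delta[pos][v] for v in range(n)]
        best = min(scores)
        minimizers = [v for v in range(n) if scores[v] == best]
        # ... preferring a minimizer with w_t(v) = w_{t-1}(v) + r_t(v).
        tight = [v for v in minimizers if w_new[v] == w[v] + r[v]]
        nxt = (tight or minimizers)[0]
        cost += delta[pos][nxt] + r[nxt]            # transition plus processing cost
        pos, w = nxt, w_new
        visited.append(pos)
    return visited, cost

# Hypothetical example: line 0 - 1 - 2, unit lengths, node 0 is expensive twice.
delta = [[0, 1, 2],
         [1, 0, 1],
         [2, 1, 0]]
visited, cost = wfa(delta, 0, [[3, 0, 0], [3, 0, 0]])
```

In the first round both node 0 and node 1 minimize the score, but only node 1 satisfies the side condition, so the algorithm moves there and pays the transition cost once.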



In the sequel, we use WFA and OPT, respectively, to denote the work function
and the optimal offline algorithm. For a given sequence $S = (\tau_1, \ldots, \tau_\ell)$ of tasks,
$WFA[S]$ and $OPT[S]$ refer to the cost incurred by WFA and OPT on $S$, respectively.
By $s_0, \ldots, s_\ell$ we denote the sequence of nodes visited by WFA.

We state the following facts without proof.

Fact 1. For any two nodes $u$ and $v$ and any time $t$, $|w_t(u) - w_t(v)| \le \delta(u, v)$.

Fact 2. At any time $t$, $w_t(s_t) = w_t(s_{t-1}) - \delta(s_{t-1}, s_t)$.

Fact 3. At any time $t$, $r_t(s_t) + \delta(s_{t-1}, s_t) = w_t(s_{t-1}) - w_{t-1}(s_t)$.

3 Smoothing Model 

Let the adversarial task sequence be given by $\check{S} := (\check{\tau}_1, \ldots, \check{\tau}_\ell)$. We smoothen
each task vector $\check{\tau}_t := (\check{r}_t(v_1), \ldots, \check{r}_t(v_n))$ by perturbing each original cost entry
$\check{r}_t(v_j)$ according to some probability distribution $f$ as follows

$$r_t(v_j) := \max\{0, \check{r}_t(v_j) + \epsilon(v_j)\}, \quad \text{where } \epsilon(v_j) \leftarrow f.$$

That is, to each original cost entry we add a random number which is chosen
from $f$. The obtained smoothed task is denoted by $\tau_t := (r_t(v_1), \ldots, r_t(v_n))$. We
use $\mu$ and $\sigma$, respectively, to denote the expectation and the standard deviation
of $f$. We assume that $f$ is symmetric around $\mu := 0$. We take the maximum of
zero and the smoothing outcome in order to assure that the smoothed costs are
non-negative. Thus, the probability for an original zero cost entry to remain zero
is amplified to $1/2$.
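
The smoothing step can be sketched as follows, assuming a normal distribution with mean 0 for the noise (one of the distributions the paper names as admissible). The helper `smooth_task` and the example task are hypothetical.

```python
import random

def smooth_task(task, sigma, rng):
    """Perturb every cost entry by noise drawn from Normal(0, sigma) and clip at zero."""
    return [max(0.0, cost + rng.gauss(0.0, sigma)) for cost in task]

rng = random.Random(0)
adversarial_task = [0.0, 2.0, 0.0, 5.0]
smoothed_task = smooth_task(adversarial_task, sigma=1.0, rng=rng)

# Empirical check: an original zero entry stays zero whenever the noise is negative.
zero_count = sum(smooth_task([0.0], 1.0, rng)[0] == 0.0 for _ in range(10000))
```

With a continuous symmetric distribution the noise is negative about half of the time, so `zero_count` should be close to 5000 in this experiment.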

A major criticism to the additive model is that zero entries are destroyed. 
However, one can easily verify that the lower bound proof of 2n — 1 [3,4,1] on 
the competitive ratio of any deterministic algorithm for metrical task systems 
goes through for any smoothing model that does not destroy zeros. 

Our analysis holds for a large class of probability distributions, which we
call permissible. We say $f$ is permissible if (i) $f$ is symmetric around $\mu = 0$
and (ii) $f$ is non-increasing in $[0, \infty)$. For example, the uniform and the normal
distribution are permissible. Since the stated upper bounds on the competitive
ratio of WFA do not further improve by choosing $\sigma$ much larger than $U_{\min}$, we
assume that $\sigma \le 2U_{\min}$. Moreover, we use $c_f$ to denote a constant depending on
$f$ such that for a random $\epsilon$ chosen from $f$, $\mathbf{P}[\epsilon \ge \sigma/c_f] \ge 1/4$.

All our results hold against an adaptive adversary. An adaptive adversary 
reveals the task sequence over time, thereby taking decisions made by the online 
algorithm in the past into account. 



4 A Lower Bound on the Optimal Offline Cost 

In this section, we establish a lower bound on the cost incurred by an optimal
offline algorithm OPT when run on smoothed task sequences. For the purpose
of proving the lower bound, we first investigate an interesting version of a balls
into bins experiment, which we call the constrained balls into bins game.



4.1 Constrained Balls into Bins Game 

We are given $n$ bins $B_1, \ldots, B_n$. In each round, we place a ball independently
in each bin $B_i$ with probability $p$; with probability $1 - p$ no ball is placed in $B_i$.
We define the height $h_t(i)$ of a bin $B_i$ as the number of balls in $B_i$ after round $t$.
We have dependencies between the heights of different bins that are specified by
an (undirected) constraint graph $G_c := (V_c, E_c)$. The node set $V_c$ of $G_c$ contains
$n$ nodes $u_1, \ldots, u_n$, where each node $u_i$ corresponds to a bin $B_i$. All edges in
$E_c$ have uniform edge lengths equal to $Q$. Let $D$ be the maximum degree of a
vertex in $G_c$. Throughout the experiment, we maintain the following invariant.

Invariant: The difference in height between two bins Bi and Bj is at most the 
shortest path distance between Ui and uj in Gc. 

If the placement of a ball into a bin Bi would violate this invariant, the ball is 
rejected; otherwise we say that the ball is accepted. Observe that if two bins Bi 
and Bj do not violate the invariant in round t, then, in round t + 1, Bi and Bj 
might cause a violation only if one bin, say Bi, receives a ball, and the other, 
Bj, does not receive a ball; if both receive a ball, or both do not receive a ball, 
the invariant remains true. 
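
The game can be simulated with a short sketch. Two simplifying assumptions are made here and are not taken from the paper: the invariant is checked only along edges of the constraint graph (which implies it for all pairs, since all edges have length $Q$), and a ball is tested against the neighbours' heights at the start of the round. This acceptance rule is conservative but never violates the invariant. All names are hypothetical.

```python
import random

def play_round(heights, adj, Q, p, rng):
    """One round: each bin is offered a ball with probability p and accepts it
    only if its new height exceeds no neighbour's current height by more than Q."""
    new_heights = list(heights)
    for i, h in enumerate(heights):
        if rng.random() >= p:
            continue                                   # no ball offered to bin i
        if all(h + 1 - heights[j] <= Q for j in adj[i]):
            new_heights[i] = h + 1                     # ball accepted
    return new_heights

def rounds_until_height(adj, Q, p, z, h, rng):
    """Count the rounds until bin z reaches height h; also return final heights."""
    heights = [0] * len(adj)
    rounds = 0
    while heights[z] < h:
        heights = play_round(heights, adj, Q, p, rng)
        rounds += 1
    return rounds, heights

# Hypothetical constraint graph: a path on four bins, Q = 1.
path = [[1], [0, 2], [1, 3], [2]]
rounds, heights = rounds_until_height(path, Q=1, p=0.5, z=0, h=5, rng=random.Random(1))
```

Repeating the experiment for growing $h$ gives an empirical feel for how the number of rounds depends on the degree of the constraint graph.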

Fig. 1. Illustration of the “unfolding” for $Q = 1$ and $h = 5$. Left: constraint graph $G_c$.
Right: layered dependency graph $\mathcal{D}_h$.



Theorem 1. Fix any bin $B_z$. Let $R_z$ be the number of rounds needed until the
height of $B_z$ becomes $h \ge \log(n)$. Then, $\mathbf{P}[R_z > c_3 h (1 + \log(D)/Q)] \le 1/n^4$.

We remark that there are instances, where the above bound is indeed tight. 

We next describe how one can model the growth of the height of $B_z$ by
an alternative, but essentially equivalent, game on a layered dependency graph.
A layered dependency graph $\mathcal{D}_h$ consists of $h$ layers, $V_1, \ldots, V_h$, and edges are
present only between adjacent layers. The idea is to “unfold” the constraint
graph $G_c$ into a layered dependency graph $\mathcal{D}_h$.

We describe the construction for $Q = 1$; the details for $Q > 1$ can be found
in [6]. Each layer of $\mathcal{D}_h$ corresponds to a subset of nodes in $G_c$. Layer 1 consists
of $z$ only, the node corresponding to bin $B_z$. Assume we have constructed layers
$V_1, \ldots, V_i$, $i < h$. Then, $V_{i+1}$ is constructed from $V_i$ by adding all nodes, $\Gamma_{G_c}(V_i)$,
that are adjacent to $V_i$ in $G_c$, i.e., $V_{i+1} := V_i \cup \Gamma_{G_c}(V_i)$. For every pair $(u, v) \in
V_i \times V_{i+1}$, we add an edge $(u, v)$ to $\mathcal{D}_h$ if $(u, v) \in E_c$, or $u = v$. See Figure 1 for
an example.
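
The unfolding for $Q = 1$ can be sketched as below. The constraint graph is assumed to be given as a dictionary mapping each node to the set of its neighbours; the function name `unfold` and the example path are hypothetical.

```python
def unfold(adj, z, h):
    """Unfold a constraint graph into h layers, starting from node z.

    Returns (layers, edges): layers[i] is the node set of layer i + 1, and each
    edge joins (layer, node) pairs of two adjacent layers."""
    layers = [{z}]
    for _ in range(h - 1):
        prev = layers[-1]
        # Next layer: previous layer plus every node adjacent to it.
        layers.append(prev | {w for v in prev for w in adj[v]})
    edges = []
    for i in range(h - 1):
        for u in layers[i]:
            for v in layers[i + 1]:
                if u == v or v in adj[u]:
                    edges.append(((i + 1, u), (i + 2, v)))
    return layers, edges

# Hypothetical constraint graph: a path 0 - 1 - 2 - 3, unfolded from node 0.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
layers, edges = unfold(adj, z=0, h=3)
```

Every node of a layer is joined to itself and to its neighbours in the next layer, so it has at most $D + 1$ edges into the next layer.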

Now, the following game on $\mathcal{D}_h$ is equivalent to the balls and bins game. Each
node in $\mathcal{D}_h$ is in one of three states, namely UNFINISHED, READY or FINISHED.
Initially, all nodes in layer $h$ are READY and all other nodes are UNFINISHED. In
each round, all READY nodes toss a coin; each coin independently turns up head
with probability $p$ and tail with probability $1 - p$. A READY node changes its
state to FINISHED if the outcome of its coin toss is head. At the end of each round,
an UNFINISHED node in layer $j$ changes its state to READY, if all its neighbors in
layer $j + 1$ are FINISHED.

Note that the nodes in layer $V_j$ are FINISHED if and only if the corresponding
bins $B_i$, $i \in V_j$, have height at least $h - j + 1$. Consequently, the number of rounds
needed until the root node $z$ in $\mathcal{D}_h$ becomes FINISHED is equal to the number of
rounds needed for the height of $B_z$ to become $h$.

Proof (Theorem 1). Let $\mathcal{D}_h$ be a layered dependency graph constructed from $G_c$
as described above. As argued above, the event $(R_z \le t)$ is equivalent to the
event that the root node becomes FINISHED within $t$ rounds in $\mathcal{D}_h$. Consider the
event that the root node $z$ does not become FINISHED after $t$ rounds. Then, there
exists a bad path $P := (v_1, \ldots, v_h)$ from $z = v_1$ to some node $v_h$ in the bottom
layer $h$ such that no node $v_i$ of $P$ was delayed by nodes other than $v_{i+1}, \ldots, v_h$.
Put differently, $P$ was delayed independently of any other path. Consider the
outcome of the coin flips only for the nodes along $P$. If $P$ is bad then the number
of coin flips, denoted by $X$, that turned up head within $t$ rounds is at most $h - 1$.
Let $\alpha(t)$ denote the probability that $P$ is bad, i.e., $\alpha(t) := \mathbf{P}[X \le h - 1]$. Clearly,
$\mathbf{E}[X] = tp$.

Observe that in $\mathcal{D}_h$ any node has at most $D + 1$ neighbors in the next larger
layer. That is, the number of possible paths from $z$ to any node $v$ in layer $h$ is
bounded by $(D + 1)^h$.

Thus, $\mathbf{P}[R_z > t] \le \alpha(t)(D + 1)^h$. We want to choose $t$ such that this
probability is at most $1/n^4$. If we choose $t \ge (32/p)(h + h \log(D))$ and use Chernoff’s
bound [5] on $X$, we obtain for $h \ge \log(n)$

$$\alpha(t) = \mathbf{P}[X \le h - 1] \le \mathbf{P}[X \le pt/2] \le e^{-pt/8} \le \frac{1}{n^4 (D + 1)^h}.$$

□



4.2 Lower Bound 

We are now in a position to prove the following lemma. 

Lemma 1. Let $\check{S}$ be an adversarial sequence of $\ell := \lceil c_2 n\gamma (U_{\min}/\sigma + \log(D)) \rceil$
tasks, for a fixed constant $c_2$ and some $\gamma \ge 1$. Then, $\mathbf{P}[OPT[S] < n\gamma U_{\min}] \le
1/n^3$.

We relate the growth of the work function values to the balls and bins game
as follows. For each node $v_i$ of $G$ we have a corresponding bin $B_i$. We obtain the
constraint graph $G_c$ from $G$ by setting all edge lengths to $Q := \lfloor U_{\min}/\Delta \rfloor$, where
$\Delta := \min\{U_{\min}, \sigma/c_f\}$. Since for any $v_i$ and any time $t$, $\mathbf{P}[r_t(v_i) \ge \sigma/c_f] \ge 1/4$,
we place a ball into $B_i$ with probability $1/4$. The following lemma establishes a
relation between the work function value of $v_i$ and the height $h_t(i)$ of $B_i$.

Lemma 2. Consider any node $v_i$ and its corresponding bin $B_i$. Let $h_t(i)$ denote
the number of balls in bin $B_i$ after $t$ rounds. Then, for any $t \ge 0$, $w_t(v_i) \ge
h_t(i) \Delta$.

Put differently, the number of rounds needed until a bin $B_i$ has height $h$
stochastically dominates the time $t$ needed until $w_t(v_i) \ge h\Delta$. Applying
Theorem 1, we obtain that after $\ell := \lceil c_2 n\gamma (U_{\min}/\sigma + \log(D)) \rceil$ rounds, for an
appropriate constant $c_2$, the probability that there exists a bin of height less than
$2n\gamma Q$ is at most $1/n^3$. That is, with probability at least $1 - 1/n^3$, all $v_i$
satisfy $w_\ell(v_i) \ge 2n\gamma Q\Delta \ge n\gamma U_{\min}$. Since $OPT[S] = \min_{u \in V}\{w_\ell(u)\}$, Lemma 1
follows.

We will use Lemma 1 several times as follows.

Corollary 1. Let $\check{S}$ be an adversarial sequence of $\ell := \lceil c_2 n\gamma (U_{\min}/\sigma + \log(D)) \rceil$
tasks, for a fixed constant $c_2$ and some $\gamma \ge 1$. Then, the smoothed competitive
ratio of WFA is at most $\mathbf{E}[WFA[S]]/(n\gamma U_{\min}) + o(1)$.
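
Proof. Let $S$ be a random variable denoting a smoothed sequence obtained from
$\check{S}$. We define $\mathcal{E}$ as the event that OPT incurs a cost of at least $n\gamma U_{\min}$ on $S$. By
Lemma 1, we have $\mathbf{P}[\neg\mathcal{E}] \le 1/n^3$. Thus,

$$
\begin{aligned}
\mathbf{E}\left[\frac{WFA[S]}{OPT[S]}\right]
&= \mathbf{E}\left[\frac{WFA[S]}{OPT[S]} \,\Big|\, \mathcal{E}\right] \mathbf{P}[\mathcal{E}]
 + \mathbf{E}\left[\frac{WFA[S]}{OPT[S]} \,\Big|\, \neg\mathcal{E}\right] \mathbf{P}[\neg\mathcal{E}] \\
&\le \frac{\mathbf{E}[WFA[S] \mid \mathcal{E}]\, \mathbf{P}[\mathcal{E}]}{n\gamma U_{\min}} + \frac{2n - 1}{n^3} \\
&\le \frac{\mathbf{E}[WFA[S]]}{n\gamma U_{\min}} + o(1),
\end{aligned}
$$

where the first inequality follows from the definition of $\mathcal{E}$ and the fact that
the (worst case) competitive ratio of WFA is $2n - 1$. □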



5 Upper Bounds 

5.1 First Upper Bound 



We derive the first upper bound on the smoothed competitive ratio of WFA. The
idea is as follows. We derive two upper bounds on the smoothed competitive
ratio of WFA. The first one is a deterministic bound, and the second one uses
the probabilistic lower bound on OPT. We combine these two bounds using the
following fact to obtain the theorem stated below.

Fact 4. Let $A$, $B$, and $X_i$, $1 \le i \le m$, be positive quantities. We have

$$\min\left\{ \frac{A \sum_{i=1}^{m} X_i}{\sum_{i=1}^{m} X_i^2}, \; \frac{B \sum_{i=1}^{m} X_i}{m} \right\} \le \sqrt{AB}.$$
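
The fact is stated without proof. Assuming it reads $\min\{A \sum X_i / \sum X_i^2,\; B \sum X_i / m\} \le \sqrt{AB}$ (the form that matches how the two bounds are combined in the proof of Theorem 2), one possible justification is the following sketch via the Cauchy-Schwarz inequality.

```latex
% The minimum of two positive numbers is at most their geometric mean.
\begin{align*}
\min\{a, b\}^2 &\le a \cdot b
  = AB \cdot \frac{\left(\sum_{i=1}^{m} X_i\right)^2}{m \sum_{i=1}^{m} X_i^2}
  \le AB,
\end{align*}
% where the last step uses Cauchy-Schwarz:
% (\sum_i X_i \cdot 1)^2 \le (\sum_i X_i^2)(\sum_i 1^2) = m \sum_i X_i^2.
```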



Consider any deterministic input sequence $\mathcal{K}$ of length $\ell$. Let $s_0, s_1, \ldots, s_\ell$
denote the sequence of nodes visited by WFA. Define $C(t) := r_t(s_t) + \delta(s_{t-1}, s_t)$ as
the service cost plus the transition cost incurred by WFA in round $t$. With respect
to $\mathcal{K}$ we define $T$ as the set of rounds, where the increase of the work function
value of $s_{t-1}$ is at least one half of the transition cost, i.e., $t \in T$ if and only if
$w_t(s_{t-1}) - w_{t-1}(s_{t-1}) \ge \delta(s_{t-1}, s_t)/2$. We use $\bar{T}$ to refer to the complement of
$T$. Due to Fact 2 we have $w_t(s_{t-1}) = w_t(s_t) + \delta(s_{t-1}, s_t)$. Therefore, the above
definition is equivalent to

$$T := \left\{t : w_t(s_t) - w_{t-1}(s_{t-1}) \ge -\tfrac{1}{2}\delta(s_{t-1}, s_t)\right\}. \qquad (3)$$

We first prove that the total cost of WFA on $\mathcal{K}$ is bounded by a constant
times the contribution of rounds in $T$.

Lemma 3. Let $\mathcal{K}$ be a sufficiently long sequence such that $WFA[\mathcal{K}] \ge 6 Diam$.
Then, $WFA[\mathcal{K}] \le 8 \sum_{t \in T} C(t)$.

We partition $T$ into $T_1$ and $T_2$, where $T_1 := \{t \in T : w_t(s_t) - w_{t-1}(s_t) \le
4 U_{\max}\, diam\}$, and $T_2 := T \setminus T_1$.

Lemma 4. Let $\mathcal{K}$ be a sufficiently long sequence such that $OPT[\mathcal{K}] \ge 2 Diam$.
There exists a constant $b$ such that

$$OPT[\mathcal{K}] \ge \frac{1}{b} \left( \frac{1}{n U_{\max}} \sum_{t \in T_1} C(t)^2 + \sum_{t \in T_2} C(t) \right).$$



Theorem 2. The smoothed competitive ratio of WFA is
$O\left(\sqrt{n \cdot (U_{\max}/U_{\min})(U_{\min}/\sigma + \log(D))}\right)$.

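Proof. Let $\check{S}$ be an adversarial task sequence of length $\ell := \lceil c_2 n\gamma (U_{\min}/\sigma +
\log(D)) \rceil$, and let $S$ be a random variable denoting a smoothed sequence
obtained from $\check{S}$. Due to the proof of Corollary 1 it suffices to bound
$\mathbf{E}[WFA[S]/OPT[S] \mid \mathcal{E}]$, where $\mathcal{E}$ is the event $(OPT[S] \ge n\gamma U_{\min})$. Consider a
smoothing outcome $S$ such that the event $\mathcal{E}$ holds. We fix $\gamma$ sufficiently large
such that $OPT[S] \ge 6 Diam$. Observe that $WFA[S] \ge OPT[S] \ge 6 Diam$.

First, assume $\sum_{t \in T_1} C(t) \le \sum_{t \in T_2} C(t)$. Due to Lemma 3 and Lemma 4,

$$WFA[S] \le 16 \sum_{t \in T_2} C(t) \quad \text{and} \quad OPT[S] \ge \frac{1}{b} \sum_{t \in T_2} C(t).$$

Hence, $\mathbf{E}[WFA[S]/OPT[S] \mid \mathcal{E}] = O(1)$.

Next, assume $\sum_{t \in T_1} C(t) > \sum_{t \in T_2} C(t)$. By Lemma 3 and Lemma 4 we
have

$$WFA[S] \le 16 \sum_{t \in T_1} C(t) \quad \text{and} \quad OPT[S] \ge \frac{1}{b n U_{\max}} \sum_{t \in T_1} C(t)^2. \qquad (4)$$

Thus,

$$\frac{WFA[S]}{OPT[S]} \le 16 b n U_{\max} \left( \frac{\sum_{t \in T_1} C(t)}{\sum_{t \in T_1} C(t)^2} \right). \qquad (5)$$

Since $\mathcal{E}$ holds, we also have

$$\frac{WFA[S]}{OPT[S]} \le \frac{16 \sum_{t \in T_1} C(t)}{n\gamma U_{\min}} \le \frac{c}{U_{\min}} \left( \frac{U_{\min}}{\sigma} + \log(D) \right) \frac{\sum_{t \in T_1} C(t)}{|T_1|}, \qquad (6)$$

where the last inequality holds for an appropriate constant $c$ and since $\ell \ge |T_1|$.
Observe that (6) is well-defined since $\sum_{t \in T_1} C(t) > \sum_{t \in T_2} C(t)$ and
$WFA[S] \ge 6 Diam$ imply that $|T_1| \ge 1$.

Applying Fact 4 to (5) and (6), these two bounds are combined to the one
stated in the theorem. □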



5.2 Second Upper Bound 

Our second upper bound easily follows from the proof of Corollary 1 and the
following deterministic relation between WFA and OPT.

Lemma 5. Let $\mathcal{K}$ be any request sequence of length $\ell$. Then, $WFA[\mathcal{K}] \le OPT[\mathcal{K}] +
Diam \cdot \ell$.

Theorem 3. The smoothed competitive ratio of WFA is $O((Diam/U_{\min}) \cdot
(U_{\min}/\sigma + \log(D)))$.
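
The omitted step can be sketched as follows, assuming as in the proof of Corollary 1 that we condition on the event $\mathcal{E}$ that $OPT[S] \ge n\gamma U_{\min}$ and that the sequence has length $\ell = \lceil c_2 n\gamma (U_{\min}/\sigma + \log(D)) \rceil$; the complementary event contributes only $o(1)$.

```latex
\begin{align*}
\frac{WFA[S]}{OPT[S]}
  &\le 1 + \frac{Diam \cdot \ell}{OPT[S]}
     && \text{(Lemma 5)} \\
  &\le 1 + \frac{Diam \cdot \ell}{n\gamma U_{\min}}
     && \text{(event } \mathcal{E}\text{)} \\
  &= O\!\left(\frac{Diam}{U_{\min}} \left(\frac{U_{\min}}{\sigma} + \log(D)\right)\right)
     && \text{(definition of } \ell\text{)}.
\end{align*}
```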

5.3 Potential Function

The next lemma can be proved using a potential function argument. Intuitively,
it states that the expected cost of WFA is bounded by the expected cost of a
simple greedy online algorithm.

Lemma 6. Let $S$ be a smoothed sequence of $\ell$ tasks. For each $t$, $1 \le t \le \ell$,
and a given node $s$, define a random variable $\Delta_t(s) := \min_{u \in V}\{r_t(u) + \delta(u, s)\}$.
Let $\kappa > 0$. If $\mathbf{E}[\Delta_t(s)] \le \kappa$, for each $s \in V$ and for each $t$, $1 \le t \le \ell$, then
$\mathbf{E}[WFA[S]] \le 4\kappa\ell + Diam$.

5.4 Random Tasks 

We derive an upper bound on the expected competitive ratio of WFA if each
request cost is chosen independently from a probability distribution $f$ which is
non-increasing in $[0, \infty)$. We need the following fact.

Fact 5. Let $f$ be a continuous, non-increasing distribution over $[0, \infty)$ with
mean $\mu$ and standard deviation $\sigma$. Then, $\mu \le \sqrt{12}\sigma$.

Theorem 4. If each request cost is chosen independently from a non-increasing
probability distribution $f$ over $[0, \infty)$ with standard deviation $\sigma$ then the expected
competitive ratio of WFA is $O(1 + (\sigma/U_{\min}) \cdot \log(D))$.

Proof. Let $S$ be a random task sequence of length $\ell := \lceil c_2 n\gamma (U_{\min}/\sigma + \log(D)) \rceil$,
for an appropriate $\gamma \ge U_{\max}$, generated from $f$. Observe that since $\gamma \ge U_{\max}$,
we have $\ell \ge Diam$. For any $t$ and any node $s$, we have $\Delta_t(s) = \min_{u \in V}\{r_t(u) +
\delta(u, s)\} \le r_t(s)$. Since $r_t(s)$ is chosen from $f$, Fact 5 implies that $\mathbf{E}[\Delta_t(s)] \le
\kappa := \sqrt{12}\sigma$. Thus, by Lemma 6, we have $\mathbf{E}[WFA[S]] \le 4\sqrt{12}\sigma\ell + Diam = O(\sigma\ell)$.

Note that we can use the lower bound established in Section 4 to bound the
cost of OPT: The generation of $S$ is equivalent to smoothing (according to $f$)
an adversarial task sequence consisting of all-zero request vectors only. Here, we
do not need that the distribution $f$ is symmetric around its mean. The theorem
now follows from Corollary 1. □

5.5 β-Elementary Tasks

We can strengthen the upper bound on the smoothed competitive ratio of WFA if the adversarial task sequence only consists of β-elementary tasks. Recall that in a β-elementary task the number of non-zero request costs is at most β.

Theorem 5. If the adversarial task sequence only consists of β-elementary tasks, then the smoothed competitive ratio of WFA is O(β (U_max/U_min)(U_min/σ + log(D))).

The proof follows easily from the following lemma, Lemma 6 and Corollary 1.

Lemma 7. Let Tt be a task obtained by smoothing a (3-elementary task, where 
(3 < n. Then, E[Z\t(s)] < ct -I- for each node s € V. 

6 Conclusion

In this paper we focused on the asymptotic behaviour of WFA if the request costs of an adversarial task sequence are perturbed by means of a symmetric additive smoothing model. We showed that the smoothed competitive ratio of WFA is much better than its worst case competitive ratio suggests and that it depends on topological parameters of the underlying graph. Moreover, all our bounds, except the one for β-elementary tasks, are tight up to constant factors. We believe that our analysis gives a strong indication that the performance of WFA in practice is much better than 2n − 1.

An open problem is to strengthen the universal lower bounds. Moreover, it would be interesting to obtain exact (and not only asymptotic) bounds on the smoothed competitive ratio of WFA.

Abstract. Recently, Charikar et al. investigated the problem of evaluating AND/OR trees, with non-uniform costs on their leaves, from the perspective of competitive analysis. For an AND/OR tree T they presented a μ(T)-competitive deterministic polynomial time algorithm, where μ(T) is the number of leaves that must be read, in the worst case, in order to determine the value of T. Furthermore, they proved that μ(T) is a lower bound on the deterministic competitiveness, which assures the optimality of their algorithm.

The power of randomization in this context has remained an open question. Here, we take a step towards solving this problem by presenting a 0.792 μ(T)-competitive randomized polynomial time algorithm. This contrasts with the best known lower bound μ(T)/2.



1 Introduction 

A game tree is a rooted tree, where every internal node has either a MIN or 
MAX label and the parent of every MIN (MAX) node is a MAX (MIN) node. 
Every leaf is associated to a real number, its value. The value of a MAX (MIN) 
node is recursively defined as the maximum (minimum) among the values of its 
children. The value of a tree is the value of its root. Game trees play a central 
role in Artificial Intelligence, particularly in game-playing programs. 

The AND/OR tree is a particular case of a game tree, where every leaf has either value 0 or 1. It is easy to see that a MAX node can be thought of as an OR gate while a MIN node can be thought of as an AND gate. AND/OR trees are interesting in their own right since they have applications in mechanical theorem proving.

Several authors [1,2,3] have considered the problem of determining the value of game trees and AND/OR trees by reading as few leaves as possible. In [4], Charikar et al. investigated this problem from the perspective of competitive analysis. They consider the more general problem where the cost of reading a leaf x_i is c_{x_i} and the cost of evaluating a tree is the sum of the costs of the leaves that are read in this process. This variant was motivated by possible Internet applications where the costs of the information required to take some decision may vary depending on the acquisition source. What has brought us to this problem, however, are database applications where queries involving non-conventional data like images, DNA sequences and tables are handled [5,6]. These queries differ from the traditional ones, since the processing of attributes like images and sequences is much more expensive than that of usual alphanumeric registers. Since AND/OR trees model an important class of queries, they are particularly important in this scenario.

* Edital Universal 01/2002 (Proc. 476817/2003-0)

Now, we explain the competitiveness metric proposed in [4], which will also be adopted here. Let f be a function over a set of variables V = {x_1, x_2, ..., x_n} (a game tree can be thought of as a function f, where the leaves correspond to the variables in V). Each variable x_i has a non-negative cost c_{x_i} and the vector c = ⟨c_{x_1}, ..., c_{x_n}⟩ is called the cost vector. Given U ⊆ V, we define the cost of U as the sum of the costs of its variables. A setting σ of the variables is the choice of a value for each variable. The partial setting restricted to U ⊆ V is denoted by σ|_U. A set U ⊆ V is sufficient with respect to σ if the value of f is determined by the partial setting σ|_U. Such a U is a proof (certificate) of the value of f under σ. The cheapest proof of the value of f under σ is thus the sufficient set with minimum cost. We use c^f(σ) to denote the cost of such a proof.

For example, consider the AND/OR tree T presented in Figure 1. For the setting σ_R = (0, 0, 0, 1, 1), we have c^T(σ_R) = 3 + 5 + 2. On the other hand, for the setting σ_S = (0, 1, 1, 1, 1), we have c^T(σ_S) = 2 + 6 + 4. An evaluation algorithm for f sequentially reads the variables in V. The algorithm stops when the set of variables read so far is sufficient with respect to σ. The cost of the algorithm A for a setting σ is given by c_A^f(σ). As an example, let A be an algorithm that reads the variables following the sequence x_1, x_2, x_3, ... and skipping those variables that cannot affect, at the current point, the value of f anymore. Thus, for the tree T in Figure 1, c_A^T(σ_R) = 10, since A reads the leaves x_1, x_2, x_3. On the other hand, c_A^T(σ_S) = 18, since in this case it reads x_1, x_2, x_4, x_5.

Fig. 1. An AND/OR tree T with cost vector c = ⟨3, 5, 2, 6, 4⟩.
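The numbers in this example can be reproduced with a short Python sketch. The drawing of Figure 1 is not available here, so the tree shape AND(OR(x_1, x_2, x_3), x_4, x_5) used below is an assumption; it is chosen because it is consistent with every value stated in the text. The function names are hypothetical.

```python
# Trees are nested tuples: ("leaf", index) or (label, [children]).
# ASSUMED shape of Figure 1: AND(OR(x1, x2, x3), x4, x5).
FIG1 = ("AND", [("OR", [("leaf", 0), ("leaf", 1), ("leaf", 2)]),
                ("leaf", 3), ("leaf", 4)])
COSTS = [3, 5, 2, 6, 4]

def value(tree, sigma):
    if tree[0] == "leaf":
        return sigma[tree[1]]
    vals = [value(ch, sigma) for ch in tree[1]]
    return int(all(vals)) if tree[0] == "AND" else int(any(vals))

def cheapest_proof(tree, sigma, costs):
    """Cost of the cheapest certificate for the value of tree under sigma."""
    if tree[0] == "leaf":
        return costs[tree[1]]
    deciding = 0 if tree[0] == "AND" else 1   # value a single child can force
    if value(tree, sigma) == deciding:
        return min(cheapest_proof(ch, sigma, costs)
                   for ch in tree[1] if value(ch, sigma) == deciding)
    return sum(cheapest_proof(ch, sigma, costs) for ch in tree[1])

def left_to_right_cost(tree, sigma, costs):
    """Cost of algorithm A: read leaves in order, skipping irrelevant ones."""
    if tree[0] == "leaf":
        return costs[tree[1]]
    deciding = 0 if tree[0] == "AND" else 1
    total = 0
    for ch in tree[1]:
        total += left_to_right_cost(ch, sigma, costs)
        if value(ch, sigma) == deciding:
            break                              # remaining children cannot matter
    return total

SIGMA_R = (0, 0, 0, 1, 1)
SIGMA_S = (0, 1, 1, 1, 1)
```

Under this assumed shape, A pays 18 on σ_S while the cheapest proof costs 12, so its ratio on that setting is 1.5.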

The competitiveness of A is defined by

    γ_c^A(f) = max_σ c_A^f(σ) / c^f(σ).

The best possible competitive ratio for any deterministic algorithm, then, is γ_c^f = min_A {γ_c^A(f)}, where the minimum is taken over all possible deterministic algorithms A. For the case where f is an AND/OR tree function, Charikar et al. [4] present a pseudo-polynomial γ_c^f-competitive deterministic algorithm.

Furthermore, they studied the dependence of the competitive ratio on the structure of f, defining the extremal competitiveness γ(f) of f as γ(f) = max_c γ_c^f.

This measure captures, to some extent, the complexity of f, leaving the cost vector in the background. For the case where f is an AND/OR tree T, they show that γ(T) = max{k(T), l(T)}, where k(T) and l(T) are, respectively, the number of leaves that must be read in the worst case in order to guarantee that the value of T(σ) is 1 or 0. A simple method to calculate these values is described in Section 2.



1.1 Our Result 



The main open direction according to Charikar et al. [4] is understanding the power of randomization in this context. Here, we take an important step in this direction. Given an algorithm A, its randomized competitiveness is defined by

    δ_c^A(f) = max_σ E[c_A^f(σ) / c^f(σ)] = max_σ E[c_A^f(σ)] / c^f(σ).

The optimal randomized competitiveness is defined by δ_c^f = min_A δ_c^A(f). Finally, the extremal randomized complexity of f is defined by δ(f) = max_c δ_c^f. In [4], the following is observed.

Theorem 1. If T is an AND/OR tree, then δ(T) ≥ (1 + max{k(T), l(T)})/2.

Clearly, δ(T) ≤ γ(T) = max{k(T), l(T)}, since any deterministic algorithm can be viewed as a randomized algorithm. Here, we show that δ(T) ≤ 0.792 max{k(T), l(T)}.

This result is proved through the analysis of a randomized polynomial algorithm that combines three key ideas: an optimal way to evaluate AND/OR trees with depth at most 2; a "binarization" of trees with unrestricted depth; and a variation of the WeakBalance algorithm proposed in [4] which specially handles nodes whose children are roots of trees with depth at most 1. The main question that remains open is whether δ(T) = (1 + max{k(T), l(T)})/2 or not.



1.2 Related Work and Paper Organization 

Given an AND/OR tree T, it is known that any deterministic algorithm, in the worst case, must evaluate all leaves of T before determining its value.

Tarsi [1] considered the problem of minimizing the expected number of evaluated leaves for a probability distribution in which every leaf has probability p of having value 1. He proved that a certain class of deterministic algorithms is optimal for balanced trees (a class that includes uniform trees).

For binary trees, where every internal node has exactly two children and every leaf is at distance 2k from the root, Snir [2] presents a randomized algorithm which reads O(n^{0.793}) leaves on average, where n = 2^{2k} is the total number of leaves. In [3], Saks and Wigderson show that Snir's algorithm reads, in fact, Θ(n^{0.753...}) leaves on average. Furthermore, they prove that this algorithm is optimal. For general AND/OR trees, they present techniques for generating upper and lower bounds on the expected number of leaves that need to be read.

The paper is organized as follows. In Section 2, we introduce some additional notation and state some facts that will be useful throughout this text. In Section 3, we prove that δ(T) = (1 + max{k(T), l(T)})/2 for every AND/OR tree T with depth at most 2. Finally, in Section 4, we prove that δ(T) ≤ 0.792 max{k(T), l(T)}, the main result of this paper.

2 Notations and Basic Facts 

Let T be a rooted tree with costs on its leaves. Define h(T) as the depth of T, that is, the length of the longest path from the root of T to a leaf. If T is a leaf, h(T) = 0. Given a node x in T, let T_x be the maximal (w.r.t. node inclusion) subtree of T rooted at x. We use c_T to denote the sum of the costs of the leaves of T. Throughout this text we use r to denote the root of T and T_1, ..., T_k to denote the subtrees rooted at the children of r.

A general AND/OR tree (G-AND/OR tree) T is a rooted tree where every internal node has either an AND or an OR label. Furthermore, each leaf x of T is associated with a cost c_x and a bit value. The value of an AND internal node is 1 if all of its children have value 1 and 0 otherwise. The value of an OR internal node is 0 if all of its children have value 0 and 1 otherwise. On some occasions, we use the term variables of T to refer to the leaves of T. The value of T for a setting σ is denoted by T(σ). As an example, for the setting σ_R = (0, 0, 0, 1, 1) in Figure 1, we have T(σ_R) = 0. We say that two G-AND/OR trees T and T' are equivalent if T(σ) = T'(σ) for every σ. Whenever the context is clear we abuse notation by using σ to refer to the partial setting restricted to the leaves of T_x.

An AND/OR tree is a G-AND/OR tree where the parent of every AND (OR) node is an OR (AND) node and each internal node has at least two children. A single leaf is a trivial AND/OR tree. It is easy to verify the following fact: every G-AND/OR tree T is equivalent to an AND/OR tree T' such that h(T') ≤ h(T).

The functions k(T) and l(T) can be calculated as follows. If T is a leaf, then k(T) = l(T) = 1. If r is an AND node, then k(T) = Σ_{i=1}^{k} k(T_i) and l(T) = max_{i=1,...,k} l(T_i). Similarly, if r is an OR node, then l(T) = Σ_{i=1}^{k} l(T_i) and k(T) = max_{i=1,...,k} k(T_i). For example, in the tree of Figure 1, we have k(T) = l(T) = 3.
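The recursion for k(T) and l(T) translates directly into code. The sketch below is illustrative only: the tuple encoding of trees and the function name k_and_l are assumptions, and the shape used for Figure 1, AND(OR(x_1, x_2, x_3), x_4, x_5), is inferred from the values quoted in the text rather than taken from the figure itself.

```python
def k_and_l(tree):
    """Return (k(T), l(T)) for a tree given as ("leaf", i) or (label, [children])."""
    if tree[0] == "leaf":
        return 1, 1
    pairs = [k_and_l(child) for child in tree[1]]
    ks = [k for k, _ in pairs]
    ls = [l for _, l in pairs]
    if tree[0] == "AND":
        return sum(ks), max(ls)   # k adds up over children, l takes the maximum
    return max(ks), sum(ls)       # OR node: the roles of k and l are swapped

# ASSUMED shape of the tree in Figure 1.
FIG1 = ("AND", [("OR", [("leaf", 0), ("leaf", 1), ("leaf", 2)]),
                ("leaf", 3), ("leaf", 4)])
K_FIG1, L_FIG1 = k_and_l(FIG1)
```

For this tree both values equal 3, matching the example in the text.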

In order to make reading easier we provide a list of notation: c^T(σ), cost of the cheapest proof for the value of T when the setting is σ; c_T, sum of the costs of the leaves of T; c_A^T(σ), cost incurred by algorithm A to determine the value of T when the setting is σ; T(σ), value of T when the setting is σ; h(T), depth of T.

3 Evaluating Trees of Depth at Most 2 

In this section, we prove the following theorem.

Theorem 2. If T is an AND/OR tree and h(T) ≤ 2, then δ(T) = (1 + max{k(T), l(T)})/2.

In [3], Saks and Wigderson defined the class of directional algorithms. An algorithm is directional if it reads the leaves of T following a depth-first search in T, in which the next child of the current node to be visited is randomly selected according to some probability distribution.

The proof of Theorem 2 follows from the analysis of EVAL, a directional algorithm presented below. What makes EVAL interesting is the probability distribution employed, in which the next subtree to be visited is selected with a probability that depends on the square of the inverse of the sum of the costs of its leaves.



EVAL(T: AND/OR tree)

If T is a leaf then Read T and Return the value of T
S ← {1, ..., k}   (*)
For each i ∈ S make w_i ← 1/(c_{T_i})^2
While S ≠ ∅ do
    Select an index i from S with probability w_i/W, where W = Σ_{j∈S} w_j
    If the root of T is an OR gate then
        If EVAL(T_i) = 1 then Return 1
    Else
        If EVAL(T_i) = 0 then Return 0
    S ← S − {i}
If the root of T is an OR gate then Return 0
Else Return 1

Fig. 2. EVAL Algorithm
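A runnable rendering of EVAL may help in reading Figure 2. The following Python sketch follows the pseudocode under a few stated assumptions: trees are encoded as nested tuples, all leaf costs are strictly positive (so that the weights 1/c^2 are defined), and W is taken to be the sum of the weights of the indices still in S. The function names are hypothetical, and the function returns the cost paid alongside the value.

```python
import random

def leaf_cost_sum(tree, costs):
    """c_T: total cost of the leaves of a tree ("leaf", i) or (label, [children])."""
    if tree[0] == "leaf":
        return costs[tree[1]]
    return sum(leaf_cost_sum(ch, costs) for ch in tree[1])

def eval_tree(tree, sigma, costs, rng):
    """One random run of EVAL; returns (value of tree, total cost of leaves read)."""
    if tree[0] == "leaf":
        return sigma[tree[1]], costs[tree[1]]          # Read T
    label, children = tree
    remaining = list(range(len(children)))             # S <- {1, ..., k}
    weight = [1.0 / leaf_cost_sum(ch, costs) ** 2 for ch in children]
    deciding = 1 if label == "OR" else 0               # value that stops the loop
    paid = 0
    while remaining:
        # random.choices normalises the weights, i.e. picks i with prob. w_i / W.
        i = rng.choices(remaining, weights=[weight[j] for j in remaining])[0]
        child_value, child_cost = eval_tree(children[i], sigma, costs, rng)
        paid += child_cost
        if child_value == deciding:
            return deciding, paid
        remaining.remove(i)                            # S <- S - {i}
    return 1 - deciding, paid
```

A call such as eval_tree(tree, sigma, costs, random.Random(0)) gives a reproducible run; averaging the returned cost over many runs estimates E[c_EVAL^T(σ)].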



We have the following lemma, whose proof we defer to the extended version of the paper.

Lemma 1. Let Pr[S, i, j] be the probability of EVAL selecting the index i before the index j. Then, Pr[S, i, j] ≤ w_i/(w_i + w_j).

Lemma 2. Let T be an AND/OR tree with depth at most 1 and let σ be a setting for T. If T(σ) = 1, then E[c_EVAL^T(σ)]/c^T(σ) ≤ (1 + l(T))/2. On the other hand, if T(σ) = 0, then E[c_EVAL^T(σ)]/c^T(σ) ≤ (1 + k(T))/2.

Proof: If h(T) = 0, then T is trivial and k(T) = l(T) = 1. Therefore, the result holds.

Assume that h(T) = 1. We only present the proof for the case where T(σ) = 0, since the proof for the other case is similar.

Subcase 1) r is an OR node. In this case, k(T) = 1 and the minimum proof consists of all leaves, so that c_EVAL^T(σ)/c^T(σ) = 1 = (k(T) + 1)/2.

Subcase 2) r is an AND node. In this case, k(T) is the number of leaves in T. Let x_j be the leaf with minimum cost among those with value 0 and let X_ij be a random variable defined as follows: X_ij = 1 if x_i is evaluated by EVAL before x_j and X_ij = 0 otherwise. If i = j, define X_ij = 1. Then,

    E[c_EVAL^T(σ)]/c^T(σ) ≤ E[Σ_{i=1}^{k(T)} c_{x_i} X_ij] / c_{x_j}.

It follows from Lemma 1 and from the linearity of expectation that

    E[c_EVAL^T(σ)]/c^T(σ) ≤ Σ_{i=1}^{k(T)} (c_{x_i}/c_{x_j}) Pr[X_ij = 1]
                          ≤ 1 + (k(T) − 1) max_{i≠j} { c_{x_i} c_{x_j} / (c_{x_i}^2 + c_{x_j}^2) }
                          ≤ (1 + k(T))/2,

where the last inequality follows from the arithmetic-geometric inequality. □
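The role of the squared inverse costs in this last step can be spelled out. The display below is a reconstruction of the intermediate algebra, not a quotation of the paper; it writes c_i for c_{x_i} and uses w_i = 1/c_i^2, which is the weight EVAL assigns to a leaf child.

```latex
\Pr[X_{ij}=1] \;\le\; \frac{w_i}{w_i+w_j} \;=\; \frac{c_j^{2}}{c_i^{2}+c_j^{2}},
\qquad
\frac{c_i}{c_j}\cdot\frac{c_j^{2}}{c_i^{2}+c_j^{2}}
  \;=\; \frac{c_i\,c_j}{c_i^{2}+c_j^{2}} \;\le\; \frac12
\quad\text{since } c_i^{2}+c_j^{2} \ge 2\,c_i c_j .
```

Each of the k(T) − 1 leaves other than x_j therefore contributes at most 1/2 to the ratio, and x_j itself contributes 1.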
We can obtain a similar result for trees with depth 2, whose proof we defer to the extended version of the paper.

Lemma 3. Let T be an AND/OR tree with depth 2 and let σ be a setting for T. If T(σ) = 1, then E[c_EVAL^T(σ)]/c^T(σ) ≤ (1 + l(T))/2. On the other hand, if T(σ) = 0, then E[c_EVAL^T(σ)]/c^T(σ) ≤ (1 + k(T))/2.



4 Evaluating AND/OR Trees of Unrestricted Depth 

In this section, we describe the RWB algorithm, which combines the ideas presented in the previous section with some of the ideas introduced in the algorithm WeakBalance [4]. For convenience, we explain the algorithm using a G-AND/OR tree T' obtained through a set of transformations on the given AND/OR tree T that we call binarization. This new tree has the following properties:

(i) T' is equivalent to T, that is, T(σ) = T'(σ), for all σ;

(ii) k(T) = k(T') and l(T) = l(T');

(iii) If x is an internal node in T' and h(T'_x) ≥ 3, then x has exactly two children.

It is easy to obtain such a tree T' starting from T. Basically, while the current tree has a node x that does not satisfy condition (iii), the following rule is applied.

Binarization Rule: Let LAB ∈ {AND, OR} be the label of x and let N_1, N_2, ..., N_k, with k > 2, be the children of x. Replace x by an internal node x' with two children N_1 and N_0. Assign label LAB to both x' and N_0. Make N_2, ..., N_k the children of N_0.

This rule is applied until a tree T' with the desired properties is obtained. It is easy to verify that T' satisfies the desired conditions. Let g be any function of k(T) and l(T). One can prove that E[c_RWB^T(σ)]/c^T(σ) ≤ g(k(T), l(T)) by showing that E[c_RWB^{T'}(σ)]/c^{T'}(σ) ≤ g(k(T'), l(T')). We will use this fact in some proofs.
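The binarization rule can be sketched in a few lines of Python. This is an illustrative rendering only: the tuple encoding of trees, the function names, and the choice of the first listed child as N_1 are assumptions of the example (the text does not fix which child plays the role of N_1).

```python
def height(tree):
    """Depth of a tree given as ("leaf", i) or (label, [children])."""
    return 0 if tree[0] == "leaf" else 1 + max(height(ch) for ch in tree[1])

def binarize(tree):
    """Apply the Binarization Rule bottom-up until property (iii) holds."""
    if tree[0] == "leaf":
        return tree
    label = tree[0]
    children = [binarize(ch) for ch in tree[1]]
    node = (label, children)
    if height(node) >= 3 and len(children) > 2:
        # x' keeps N_1; a new node N_0 with the same label adopts N_2, ..., N_k.
        n0 = binarize((label, children[1:]))
        node = (label, [children[0], n0])
    return node

# Hypothetical input: the root has three children and depth 4.
DEEP = ("AND", [("leaf", 1),
                ("OR", [("leaf", 2), ("AND", [("leaf", 3), ("leaf", 4)])])])
EXAMPLE = ("OR", [("leaf", 0), DEEP, ("leaf", 5)])
BINARY = binarize(EXAMPLE)
```

Because N_0 carries the same label as x', the result is in general a G-AND/OR tree rather than an AND/OR tree, as the text notes. Nodes of depth at most 2 are left untouched.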

4.1 The Random Weak Balance Algorithm 

The algorithm gets as input an AND/OR tree T. If h(T) ≤ 2, then EVAL(T) is executed. Otherwise, T is converted into a G-AND/OR tree T' through the binarization process.

If h(T') ≥ 3, RWB executes a loop, where at each iteration exactly one leaf is read. Pseudo-code is presented in Figure 3. Every node x stores a recommendation and a variable Cost_x. The recommendation is a pair (L, c_L), where L is a leaf in T'_x of cost c_L. It defines the leaf, among those in T'_x, that will be read first by RWB from the current iteration. While the recommendation stored by a leaf L is always (L, c_L), the recommendation of an internal node is updated during the execution of RWB to that stored by one of its unevaluated children. This is detailed in the recommendation scheme presented in the next section. The variable Cost_x keeps track of the cost that RWB has incurred in the subtree T'_x, that is, the sum of the costs of the leaves of T'_x evaluated so far. This information is used in the recommendation updating process. In the pseudo-code, the operation Simplify evaluates as much of T' as possible after reading the leaf L.



RWB Algorithm

For every x ∈ T' do Cost_x ← 0
Initialize the recommendation for every node traversing T' bottom-up
Let (L, c_L) be the recommendation stored by the root of T'
Read L; Simplify T'
While the value of T' remains unknown do
    For every ancestor x of L do Cost_x ← Cost_x + c_L
    Let x_1, ..., x_p be the unevaluated ancestors of L sorted by
        increasing order of distance to L
    For i = 1, ..., p do update the recommendation of x_i   (*)
    Let (L, c_L) be the recommendation stored by the root of T'
    Read L; Simplify T'

Fig. 3. The RWB Algorithm



The Recommendation Scheme. The recommendation scheme defines how the recommendation of a node is updated (initialized) during the execution of RWB. In fact, it determines the order in which the leaves are read by RWB. In particular, when the recommendation of an internal node x is updated, it defines the first leaf, among those recommended by the children of x, that will be read.

In order to describe this scheme in detail, we distinguish between three types of nodes. A node x is

- white if h(T'_x) ≤ 2;

- grey if both h(T'_x) > 2 and x has a child y with h(T'_y) ≤ 1;

- black if x is neither white nor grey.

The motivation behind this classification is that the evaluation of both white and grey nodes can be optimized using randomization. In fact, we have seen that white nodes can be efficiently evaluated through procedure EVAL.

Now, we present the recommendation scheme for white nodes.

White Nodes. Let x be a white node and let L_1, L_2, ..., L_k be the random sequence of leaves that are read when EVAL(T'_x) is executed. Then, at the beginning x holds recommendation (L_1, c_{L_1}). When L_i is read, 1 ≤ i ≤ k − 1, the recommendation of x is updated to (L_{i+1}, c_{L_{i+1}}). This assures that the order in which the leaves from T'_x are read matches the order defined by EVAL(T'_x).

Now, we explain the scheme for both black and grey nodes. If x is either a black or grey node in T', then h(T'_x) ≥ 3 and x has exactly two children that we denote by N_1 and N_2. From now on, we assume w.l.o.g. that h(T'_{N_1}) ≤ h(T'_{N_2}). Moreover, we assume that N_1 and N_2 hold recommendations (L_1, c_{L_1}) and (L_2, c_{L_2}), respectively.

Black Nodes. If x has only one unevaluated child, say y, then the recommendation of x is updated to that of y.

Otherwise, the recommendation of x is updated to (L_i, c_{L_i}), with i ∈ {1, 2}, such that the ratio of Cost_{N_i} + c_{L_i} to k(T'_{N_i}), if x is an AND node, or to l(T'_{N_i}), otherwise, is minimized.

We remark that the recommendation scheme for the black nodes is exactly the one adopted by the algorithm WeakBalance [4].

Grey Nodes. If x has only one unevaluated child, say y, then the recommendation of x is updated to that of y. Otherwise, RWB takes advantage of the following observation:

Observation 3. If the cheapest proof for the value of a grey node x consists of only leaves from T'_{N_1}, then the cost of the cheapest proof for T'_{N_1} is c_{T'_{N_1}}.

Roughly speaking, the scheme works as follows. First, it defines a threshold parameter p_x whose value is related to c_{T'_{N_1}}. Then, while the cost incurred in T'_{N_2} (Cost_{N_2}) is smaller than p_x, the recommendation from N_2 is selected. In some sense, by taking this decision, the scheme is implicitly assuming that the cheapest proof consists only of leaves from T'_{N_2}. Even if this assumption is not correct, it is not a big problem, since the cost spent in the "wrong" tree, T'_{N_2}, is not large. However, if Cost_{N_2} becomes comparable to p_x, the scheme reviews its policy by tossing an unbiased coin. Depending on the result, it either keeps selecting recommendations from N_2 or it changes to those from N_1. Finally, if Cost_{N_2} becomes larger than 2p_x, then the scheme only accepts the recommendations from N_1. This avoids RWB spending too much in T'_{N_2} when the cheapest proof consists only of leaves from T'_{N_1}.

More formally, the scheme is implemented as follows: let p_x be a threshold parameter whose value will be defined later in the analysis and let b(x) be a random bit obtained at the beginning of the execution of RWB (the value of b(x) does not change throughout the execution). We have the following cases:

1. Cost_{N_2} + c_{L_2} ≤ p_x. Then, the recommendation of x is updated to (L_2, c_{L_2}).

2. p_x < Cost_{N_2} + c_{L_2} ≤ 2p_x. If b(x) = 0, then the recommendation of x is updated to (L_1, c_{L_1}). Otherwise, it is updated to (L_2, c_{L_2}).

3. Cost_{N_2} + c_{L_2} > 2p_x. Then, the recommendation of x is updated to (L_1, c_{L_1}).
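The three cases amount to a small decision function. The Python sketch below is a hedged illustration for a grey node with two unevaluated children: the function name and the representation of a recommendation as a (leaf, cost) pair are assumptions of the example, and the threshold p_x and the bit b(x) are passed in as plain arguments.

```python
def grey_recommendation(cost_n2, rec_n1, rec_n2, p_x, b_x):
    """Pick the recommendation of a grey node x with two unevaluated children.

    cost_n2 : cost already incurred in the subtree of N_2 (Cost_{N_2})
    rec_n1  : (leaf, cost) pair currently recommended by N_1
    rec_n2  : (leaf, cost) pair currently recommended by N_2
    p_x     : threshold parameter of x
    b_x     : random bit of x, drawn once before the execution starts
    """
    spent_if_followed = cost_n2 + rec_n2[1]
    if spent_if_followed <= p_x:          # case 1: keep exploring N_2
        return rec_n2
    if spent_if_followed <= 2 * p_x:      # case 2: the fixed coin decides
        return rec_n1 if b_x == 0 else rec_n2
    return rec_n1                         # case 3: switch to N_1 for good
```

Since b(x) is fixed for the whole run, a node that enters case 2 behaves consistently on every update until case 3 applies.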



4.2 RWB Analysis 

In order to establish our main result, E[c_RWB^T(σ)]/c^T(σ) ≤ 0.792 max{k(T), l(T)}, we first prove by induction that for every node x of T', the tree obtained from T by binarization, we have E[c_RWB^{T'_x}(σ)]/c^{T'_x}(σ) ≤ max{α_1(T'_x) l(T'_x), α_0(T'_x) k(T'_x)}, where α_0 and α_1 are functions defined in Appendix A that associate a G-AND/OR tree with a real number. Then, our main result is established by proving upper bounds on both α_1 and α_0.

Lemma 4. Let T' be the G-AND/OR tree obtained from the input AND/OR tree T. Furthermore, let x be a node in T' and let σ be a setting for T'. If T'_x(σ) = 1, then E[c_RWB^{T'_x}(σ)]/c^{T'_x}(σ) ≤ α_1(T'_x) l(T'_x). On the other hand, if T'_x(σ) = 0, then E[c_RWB^{T'_x}(σ)]/c^{T'_x}(σ) ≤ α_0(T'_x) k(T'_x).



Proof: We only consider the case where T'_x(σ) = 1, since the proof for the other case is similar. The proof is by induction on the height of T'_x. The basis consists of the white nodes. If h(T'_x) ≤ 2, it follows from Lemmas 2 and 3 and from the definitions of α for white nodes that the result holds. Now, let x be a node of T' such that h(T'_x) ≥ 3.

Subcase 1) x is either a grey or black internal node with label AND. In this case, the cost of the minimum proof for T'_x is the sum of the costs of the minimum proofs for its children, that is, c^{T'_x}(σ) = c^{T'_{N_1}}(σ) + c^{T'_{N_2}}(σ). Moreover, the value of T'_x is determined right after the value of the last of its children is determined. Hence, E[c_RWB^{T'_x}(σ)] = E[c_RWB^{T'_{N_1}}(σ) + c_RWB^{T'_{N_2}}(σ)] = E[c_RWB^{T'_{N_1}}(σ)] + E[c_RWB^{T'_{N_2}}(σ)].

It follows from the inductive hypothesis that E[c_RWB^{T'_{N_i}}(σ)]/c^{T'_{N_i}}(σ) ≤ α_1(T'_{N_i}) l(T'_{N_i}), for i = 1, 2. Thus, we have that

    E[c_RWB^{T'_x}(σ)] / c^{T'_x}(σ)
        ≤ (E[c_RWB^{T'_{N_1}}(σ)] + E[c_RWB^{T'_{N_2}}(σ)]) / (c^{T'_{N_1}}(σ) + c^{T'_{N_2}}(σ))
        ≤ max_{i=1,2} {α_1(T'_{N_i}) l(T'_{N_i})} = α_1(T'_x) l(T'_x),

where the second inequality from left to right follows from the fact that (a + b)/(c + d) ≤ max{a/c, b/d} if a, b, c, d are positive real numbers. Moreover, the rightmost expression is a consequence of the definition of α_1, equations (5) and (12).

Subcase 2) x is a grey internal node with label OR. Since T'_x(σ) = 1, we have two possibilities: either the cheapest proof consists of leaves from T'_{N_1} or from T'_{N_2}. First, we consider the case where the cheapest proof consists of all the leaves in T'_{N_1} (recall Observation 3). Analyzing cases 1-3 of the recommendation scheme for grey nodes, we can conclude that

    E[c_RWB^{T'_x}(σ)] ≤ c_{T'_{N_1}} + 1.5 p_x.    (1)

Let us now consider the case where the cheapest proof consists of some leaves in T'_{N_2}, and let P_0 = Pr[c_RWB^{T'_{N_2}}(σ) > p_x]. Then, let P_1 = Pr[c_RWB^{T'_{N_2}}(σ) > 2p_x] and P_2 = P_0 − P_1.

Since, by the inductive hypothesis, the expected cost incurred at N_2 when its value is determined is at most α_1(T'_{N_2}) l(T'_{N_2}) c^{T'_{N_2}}(σ), we have that

    P_1 · 2p_x + P_2 · p_x ≤ α_1(T'_{N_2}) l(T'_{N_2}) c^{T'_{N_2}}(σ).    (2)

Now, we give an upper bound on E[c_RWB^{T'_x}(σ)]. Assume that RWB spends z to determine the value of T'_{N_2}. If z > 2p_x, then item 3 of the recommendation scheme for OR grey nodes assures that RWB spends z + c_{T'_{N_1}} to determine T'_x(σ). Otherwise, RWB pays at most z + c_{T'_{N_1}} with probability 1/2 and pays z with probability 1/2. Taking the expectation of z we get that E[c_RWB^{T'_x}(σ)] ≤ P_1 c_{T'_{N_1}} + 0.5 P_2 c_{T'_{N_1}} + α_1(T'_{N_2}) l(T'_{N_2}) c^{T'_{N_2}}(σ).

It follows from the equation above and from inequality (2) that

    E[c_RWB^{T'_x}(σ)] ≤ (c_{T'_{N_1}} + 2p_x) α_1(T'_{N_2}) l(T'_{N_2}) c^{T'_{N_2}}(σ) / (2p_x).    (3)

Hence, it follows from inequalities (1) and (3) that

    E[c_RWB^{T'_x}(σ)] / c^{T'_x}(σ) ≤ max{ 1 + 3p_x/(2c_{T'_{N_1}}), (c_{T'_{N_1}} + 2p_x) α_1(T'_{N_2}) l(T'_{N_2}) / (2p_x) }.    (4)

At this point, we can finally define a suitable value for p_x by setting it to the value that equalizes the arguments of the max expression above. One can verify that this value is exactly (α_1(T'_x)(l(T'_{N_2}) + 1) · 2c_{T'_{N_1}} − 2c_{T'_{N_1}})/3, where α_1(T'_x) is given by equation (10). Thus, we have that E[c_RWB^{T'_x}(σ)]/c^{T'_x}(σ) ≤ 1 + 1.5 p_x/c_{T'_{N_1}} = α_1(T'_x)(l(T'_{N_2}) + 1) = α_1(T'_x) l(T'_x).
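The choice of the threshold can be checked by hand. The derivation below is a worked sketch, not part of the original proof; the abbreviations A = α_1(T'_{N_2}) l(T'_{N_2}), c = c_{T'_{N_1}} and y = p_x/c are introduced here only for readability.

```latex
\begin{align*}
1 + \tfrac{3}{2}\,y &= \frac{(1+2y)\,A}{2y}
  &&\Longleftrightarrow\quad 3y^{2} + (2-2A)\,y - A = 0,\\[2pt]
y &= \frac{(A-1) + \sqrt{A^{2}+A+1}}{3},
  &&\text{so}\quad 1 + \tfrac{3}{2}\,y = \frac{A + 1 + \sqrt{A^{2}+A+1}}{2}.
\end{align*}
```

Dividing the last expression by l(T'_{N_2}) + 1 gives a quantity of the same form as the expression used for grey OR nodes in the definition of g, and p_x = c·y agrees with the value stated in the text.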

Subcase 3) x is a black internal node with label OR. In this case, the cost of the minimum proof for T'_x is equal to the cost of the minimum proof for one of its children that outputs 1. We assume w.l.o.g. that N_1 is such a child. Then, c^{T'_x}(σ) = c^{T'_{N_1}}(σ).

Let C_1 and C_2 be, respectively, the costs incurred at T'_{N_1} and T'_{N_2} when the value of T'_{N_1} is determined. The recommendation rule assures that C_2 ≤ (l(T'_{N_2}) C_1)/l(T'_{N_1}). Thus, the cost incurred at T'_x when the value of T'_x is determined is bounded above by C_1 + l(T'_{N_2}) C_1/l(T'_{N_1}) = l(T'_x) C_1/l(T'_{N_1}). Therefore,









where the second inequality from left to right follows from the application of the inductive hypothesis on T'_{N_1}. □

Now, we prove our main theorem by describing a method that provides upper bounds on both α_0 and α_1.



Theorem 4. Let T be a non-trivial AND/OR tree. Then, for every setting σ, E[c_RWB^T(σ)]/c^T(σ) ≤ 0.792 max{k(T), l(T)}.

Proof: If either h(T) = 1 or h(T) = 2, we have that max{k(T), l(T)} ≥ 2. By inspection, one can verify that E[c_RWB^T(σ)]/c^T(σ) = E[c_EVAL^T(σ)]/c^T(σ) ≤ (1 + max{k(T), l(T)})/2 ≤ 0.75 max{k(T), l(T)}.

Let us consider the case where h(T) ≥ 3. Let T' be the tree obtained from T through the binarization process. We only consider the case where T'(σ) = 1, since the proof for the other case is similar. If we prove that α_1(T') ≤ 0.792, then Lemma 4 assures the correctness of the theorem.

Let x be a node in T', with h(T'_x) ≥ 2. We define a function g such that g(x) ≥ α_1(T'_x) as follows:

    g(x) = 0.75, if x is a white node;
    g(x) = max{g(N_1), g(N_2)}, if x is a black node;
    g(x) = max{0.75, g(N_2)}, if x is a grey AND node;
    g(x) = [g(N_2) l(T'_{N_2}) + 1 + √((g(N_2) l(T'_{N_2}))^2 + g(N_2) l(T'_{N_2}) + 1)] / (2(l(T'_{N_2}) + 1)), if x is a grey OR node.

We observe that the expression for grey OR nodes decreases with l(T'_{N_2}) (the proof of this demands some algebraic manipulation) and increases with g(N_2).

By carefully considering the definitions of g and α_1, one can verify that α_1(T'_x) ≤ g(x) for every x such that h(T'_x) ≥ 2. Furthermore, it is easy to verify that only an OR grey node may have a g value larger than the maximum g value of its children. Hence, g(r), where r is the root of T', is either equal to 0.75 or to the g value of the grey OR node with maximum g value in T'. Another careful analysis shows that we can obtain an upper bound on g(r) by computing the maximum of the sequence (q_1, q_2, ..., q_{h(T')}), where q_1 = 0.75 and

    q_{i+1} = [q_i (i + 1) + 1 + √((q_i (i + 1))^2 + q_i (i + 1) + 1)] / (2i + 4).

Through some calculations we obtain that q_i ≤ 0.792, for all i. □
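The "some calculations" at the end of the proof can be reproduced numerically. The sketch below iterates the recurrence for q_i in floating point; the function name and the number of terms examined are choices made for this illustration, and a finite numerical check is of course not a substitute for the analytical argument.

```python
import math

def q_sequence(n_terms):
    """First n_terms values of q_1 = 0.75, q_{i+1} = (a + 1 + sqrt(a^2 + a + 1)) / (2i + 4),
    where a = q_i * (i + 1)."""
    qs = [0.75]
    for i in range(1, n_terms):
        a = qs[-1] * (i + 1)
        qs.append((a + 1 + math.sqrt(a * a + a + 1)) / (2 * i + 4))
    return qs

QS = q_sequence(1000)
Q_MAX = max(QS)
```

Over the first thousand terms the sequence rises for a few steps, peaks slightly above 0.79, and then decreases, which is consistent with the bound q_i ≤ 0.792 claimed in the proof.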

A The Function α

Here, we give recursive definitions for α_0(T'_x) and α_1(T'_x). At first sight, these definitions (equations (5)-(12)) seem rather unintuitive. However, they become much more natural when the reader examines the proof of Lemma 4. Thus, we strongly suggest that the reader skip the definitions below and come back to them whenever they are referred to in the proof of that lemma.

White Nodes. If x is a white node, then define α_0(T'_x) = (k(T'_x) + 1)/(2k(T'_x)) and α_1(T'_x) = (l(T'_x) + 1)/(2l(T'_x)).

Black Nodes. If x is an AND node, then define

    α_1(T'_x) = max{α_1(T'_{N_1}) l(T'_{N_1}), α_1(T'_{N_2}) l(T'_{N_2})} / l(T'_x)    (5)

    α_0(T'_x) = max{α_0(T'_{N_1}), α_0(T'_{N_2})}    (6)

If x is an OR node, then define

    α_0(T'_x) = max{α_0(T'_{N_1}) k(T'_{N_1}), α_0(T'_{N_2}) k(T'_{N_2})} / k(T'_x)    (7)

    α_1(T'_x) = max{α_1(T'_{N_1}), α_1(T'_{N_2})}    (8)

Grey Nodes. If x is an OR node, then define

    α_0(T'_x) = max{α_0(T'_{N_1}) k(T'_{N_1}), α_0(T'_{N_2}) k(T'_{N_2})} / k(T'_x)    (9)

    α_1(T'_x) = [α_1(T'_{N_2}) l(T'_{N_2}) + 1 + √((α_1(T'_{N_2}) l(T'_{N_2}))^2 + α_1(T'_{N_2}) l(T'_{N_2}) + 1)] / (2(l(T'_{N_2}) + 1))    (10)

If x is an AND node, then define

    α_0(T'_x) = [α_0(T'_{N_2}) k(T'_{N_2}) + 1 + √((α_0(T'_{N_2}) k(T'_{N_2}))^2 + α_0(T'_{N_2}) k(T'_{N_2}) + 1)] / (2(k(T'_{N_2}) + 1))    (11)

    α_1(T'_x) = max{α_1(T'_{N_1}) l(T'_{N_1}), α_1(T'_{N_2}) l(T'_{N_2})} / l(T'_x)    (12)

Abstract. The plurality problem with three colors is a game between two participants: Paul and Carol. Suppose we are given n balls colored with three colors. At any step of the game, Paul chooses two balls and asks whether they are of the same color, whereupon Carol answers yes or no. The game ends when Paul either produces a ball a of the plurality color (meaning that the number of balls colored like a exceeds those of the other colors), or when Paul states that there is no plurality. How many questions L(n) does Paul have to ask in the worst case? We show that 3⌊n/2⌋ − 2 ≤ L(n) ≤ ⌊5n/3⌋ − 2.

1 Introduction 

The problem considered in this paper is a generalization of the well known majority problem in which we are given n balls colored with two colors, for example white and black, and two players, Paul and Carol, playing the following game. At any stage of the game Paul chooses two balls x and y and asks whether they are of the same color. Carol can answer YES or NO. The game ends when Paul either produces a ball a of the majority color (meaning that the number of balls with the color of a exceeds those of the other color), or when Paul states that there is no majority (this happens when n is even and there is the same number of white and black balls). The majority problem asks to determine how many questions Paul needs in the worst case. Problems of this kind find several interesting applications in the field of fault diagnosis of systems (e.g., see [5]).

The majority problem was first solved by Saks and Werman [8]; later Alonso,
Reingold, and Schott [3] gave a different proof. The elegant combinatorial result is
that $n - \nu(n)$ questions are necessary and sufficient in the worst case, where $\nu(n)$
denotes the number of 1's in the binary representation of n. Alonso, Reingold, and
Schott [4] also gave the solution for the average case. Aigner [2] introduced
several variants and generalizations of this problem. In particular, in the
(n, k)-majority game Paul must exhibit a k-majority ball z (that is, there are at least
k balls colored like z), or declare that there is no k-majority. De Marco and Pelc [6]

* Work supported in part by the European RTN Project under contract HPRN-CT- 
2002-00278, COMBSTRU. 



V. Diekert and M. Habib (Eds.): STAGS 2004, LNCS 2996, pp. 513—521, 2004. 
considered randomized solutions for the majority problem in the more general
case when the balls correspond to the nodes of an undirected graph and the 
comparisons can only be made between adjacent nodes (of course, the problem 
reduces to the original majority problem on the complete graph). 
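As a small illustration of the two-color result quoted above, the following sketch evaluates the worst-case count $n - \nu(n)$ for a few values of n. The function names `nu` and `majority_worst_case` are chosen here for illustration only and do not come from the cited papers.

```python
def nu(n: int) -> int:
    """Number of 1's in the binary representation of n."""
    return bin(n).count("1")


def majority_worst_case(n: int) -> int:
    """Worst-case number of comparisons, n - nu(n), for the two-color majority problem."""
    return n - nu(n)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 7, 8, 10):
        print(n, majority_worst_case(n))
```

Note that the count drops below n - 1 exactly when n is not a power of two.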

Another natural generalization is to consider more than 2 colors. In this case,
two possible problems can arise: we can either seek a majority ball a (that is,
there are at least $\lfloor n/2\rfloor + 1$ balls colored like a; if no color is used on more than
$\lfloor n/2\rfloor$ balls, Paul has to state that there is no majority) or a plurality ball b.
In the latter case Paul has to produce a ball b of plurality color (that is, the number
of balls colored like b exceeds those of the other colors), or state that there is
no plurality. It is worth observing that if a majority ball exists, then it is
also a plurality ball, while a plurality ball might exist when there is no majority
ball (when there are only 2 colors there is no difference between majority and
plurality).
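The difference between the two notions can be checked directly when the full coloring is known. The sketch below is only an illustration of the definitions (it looks at all colors at once, which Paul is of course not allowed to do); the helper names are hypothetical.

```python
from collections import Counter


def majority_ball(colors):
    """Index of a ball whose color occurs at least floor(n/2) + 1 times, else None."""
    n = len(colors)
    counts = Counter(colors)
    for i, c in enumerate(colors):
        if counts[c] >= n // 2 + 1:
            return i
    return None


def plurality_ball(colors):
    """Index of a ball whose color strictly outnumbers every other color, else None."""
    ranked = Counter(colors).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # the two largest classes tie, so there is no plurality
    return colors.index(ranked[0][0])


if __name__ == "__main__":
    # Two red balls out of four: a plurality, but not a majority.
    print(plurality_ball("rrbg"), majority_ball("rrbg"))
```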

As for the first problem, Fisher and Salzberg [7] solved the majority problem
when the number of colors is any integer up to n, by showing that $\lceil 3n/2\rceil - 2$
comparisons are sufficient and necessary.

As for the plurality problem, it seems to be surprisingly difficult: while it is 
mentioned in the 1997 Alonso et al. paper, no results were known, even for the 
case of 3 colors. 

In this paper we consider the plurality problem with 3 colors. We exhibit an
algorithm that solves the problem using $\lfloor 5n/3\rfloor - 2$ comparisons in the worst
case. On the other hand, in Section 3 we show that any algorithm that correctly
determines the plurality must use at least $3\lfloor n/2\rfloor - 2$ comparisons. Note that it
was not previously known that $n + O(1)$ comparisons would not suffice.
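To get a feeling for the gap between the two bounds, the following sketch tabulates them for small n. The function names are illustrative assumptions; the formulas are the ones stated above.

```python
def lower_bound(n: int) -> int:
    """Lower bound 3*floor(n/2) - 2 on the worst-case number of comparisons."""
    return 3 * (n // 2) - 2


def upper_bound(n: int) -> int:
    """Upper bound floor(5n/3) - 2 on the worst-case number of comparisons."""
    return (5 * n) // 3 - 2


if __name__ == "__main__":
    for n in (4, 6, 9, 10, 30):
        print(n, lower_bound(n), upper_bound(n))
```

For n = 10, for instance, the bounds are 13 and 14, so they leave very little room for small n, while the gap grows roughly like n/6.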



2 The Upper Bound 

In this section we show that Paul has a strategy that uses no more than $5n/3 - 2$
comparisons to solve the plurality problem. To indicate a test (comparison)
between two balls a and b, we use the notation a : b. The outcome of a test
(the answer given by Carol) is either YES or NO. We say that Paul wins when
the game ends and Paul gives the correct solution. Let L(n) be the number of
comparisons that Paul has to ask in the worst case.

Theorem 1. We have $L(n) \le \frac{5}{3}n - 2$, for $n \ge 2$.

Proof. The proof is by induction on n. The claim is clear for $n \le 3$, so let us assume
$n \ge 4$. Paul arranges the balls $b_1, \ldots, b_n$ and compares them one by one
according to Phase I. A color class is a set of balls having the same color.

Phase I. The phase consists of a sequence of states. Every state $S_i$ (after $b_i$ has
been handled) is inductively described by a vector $(k_i, \ell_i, m_i)$, where $k_i \ge \ell_i \ge m_i$
are the cardinalities of the color classes. For $i \ge 1$, let $r_i = n - i$ be the number of
the remaining balls (those that have not been involved in any comparison yet)
and set $t_i = r_i - (k_i - \ell_i - 1)$. The phase ends at state $S_i$, for $i \ge 1$, when one
of the following conditions arises:

(A) $k_i = \ell_i = m_i$;

(B) $t_i = 0$;

(C) $t_i = 1$.
(Notice that (A) and (B) cannot arise together, and neither can (B) and (C).
Moreover, if (A) and (C) both hold, then i = n.)

Condition (A) simply says that the three color classes have the same
cardinality. The problem can thus be reduced to the same problem of smaller size
($n - 3k_i$), and Paul can use induction.

The special cases $t_i = 0, 1$ give a precise indication of the plurality, and
Paul can handle them easily.

Claim. Paul has a strategy such that at every state $S_i$ of Phase I, the following
conditions hold:

(i) $k_i \ge \ell_i \ge m_i$;

(ii) representative balls $K_i$, $L_i$ of the two largest classes (of sizes $k_i$, $\ell_i$) are known (if not
empty);

(iii) the number $T_i$ of comparisons up to (and including) $S_i$ is less than or equal
to $2k_i + \ell_i + 2m_i - 2$.

Proof. Proof by induction. After the first ball has been handled, $S_1 = (1, 0, 0)$,
$T_1 = 0 \le 2 \cdot 1 + 0 + 2 \cdot 0 - 2$, $K_1 = b_1$, and $L_1$ is unknown as the class is
empty. Let $1 \le i < n$. Suppose $K_i$ and $L_i$ are the representatives of the classes of sizes $k_i$ and $\ell_i$,
respectively, and that $b_{i+1}$ is handled. Conditions (i), (ii) are clearly preserved if
Paul uses the following strategy.

- If $k_i > \ell_i > m_i$: Paul tests $b_{i+1} : L_i$.
    if YES, $S_{i+1} = (k_i, \ell_i + 1, m_i)$;
    if NO, Paul tests $b_{i+1} : K_i$:
        if YES, $S_{i+1} = (k_i + 1, \ell_i, m_i)$;
        if NO, $S_{i+1} = (k_i, \ell_i, m_i + 1)$.

- If $k_i > \ell_i = m_i$: Paul tests $b_{i+1} : K_i$.
    if YES, $S_{i+1} = (k_i + 1, \ell_i, \ell_i)$;
    if NO, $S_{i+1} = (k_i, \ell_i + 1, \ell_i)$, $L_{i+1} = b_{i+1}$.

- If $k_i = \ell_i$, then $\ell_i > m_i$ (otherwise finished by (A)): Paul tests $b_{i+1} : K_i$.
    if YES, $S_{i+1} = (k_i + 1, k_i, m_i)$;
    if NO, Paul tests $b_{i+1} : L_i$:
        if YES, $S_{i+1} = (k_i + 1, k_i, m_i)$, $K_{i+1} = b_{i+1}$, $L_{i+1} = K_i$;
        if NO, $S_{i+1} = (k_i, k_i, m_i + 1)$.

Unless stated otherwise, $K_{i+1} = K_i$ and $L_{i+1} = L_i$.

As for condition (iii), observe that $T_{i+1}$ is equal to $T_i$ plus one or two,
according to the number of comparisons Paul made. The proof follows by induction.

Let, for example, $k_i > \ell_i > m_i$ and assume $b_{i+1}$ has the same color as $L_i$,
so that $S_{i+1} = (k_i, \ell_i + 1, m_i)$. Then $T_{i+1} = T_i + 1 \le 2k_i + \ell_i + 2m_i - 1 =
2k_i + (\ell_i + 1) + 2m_i - 2 = 2k_{i+1} + \ell_{i+1} + 2m_{i+1} - 2$. All the other cases can be
proven analogously. □
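The Phase I strategy and the invariants of the claim can be sketched in a few lines of Python. This is a hedged illustration only: the function name `phase_one`, the 0-based ball indices, and the idea of simulating Carol by a fixed list of colors are assumptions made here, not part of the paper.

```python
def phase_one(colors):
    """Run Phase I against a fixed coloring.

    Returns (i, (k, l, m), K, L, T): number of handled balls, final state,
    representatives of the two largest classes (0-based indices, L may be
    None), and the number T of comparisons asked.
    """
    n = len(colors)
    T = 0

    def same(a, b):
        # One question to Carol: are balls a and b of the same color?
        nonlocal T
        T += 1
        return colors[a] == colors[b]

    k, l, m = 1, 0, 0   # state S_1 after the first ball
    K, L = 0, None
    i = 1
    while True:
        t = (n - i) - (k - l - 1)
        # Conditions (i) and (iii) of the claim.
        assert k >= l >= m and T <= 2 * k + l + 2 * m - 2
        if k == l == m or t in (0, 1):   # stopping conditions (A), (B), (C)
            return i, (k, l, m), K, L, T
        b = i                            # ball b_{i+1} has 0-based index i
        if k > l > m:
            if same(b, L):
                l += 1
            elif same(b, K):
                k += 1
            else:
                m += 1
        elif k > l:                      # here l == m
            if same(b, K):
                k += 1
            else:
                l += 1
                L = b
        else:                            # k == l > m
            if same(b, K):
                k += 1
            elif same(b, L):
                k += 1                   # the old L-class becomes the largest
                K, L = b, K
            else:
                m += 1
        i += 1


if __name__ == "__main__":
    print(phase_one(list("rrbgrbgg")))
```

The internal assertion re-checks conditions (i) and (iii) at every state, so running the function over many colorings gives a quick sanity check of the bookkeeping.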
Claim. One of (A), (B), (C) eventually occurs.

Proof. At state $S_1$, we have $t_1 = n - 1 \ge 3$ as $k_1 = 1$, $\ell_1 = 0$ and $n \ge 4$.

Every time a ball is handled, $t_i$ changes by 0, -1 or -2. In fact $t_{i+1} - t_i =
-1 - (k_{i+1} - k_i) + (\ell_{i+1} - \ell_i)$ and the cardinality of exactly one of the three
color classes is increased by one. When $i = n$ then $t_n = -(k_n - \ell_n) + 1 \le 1$ and
hence (B) or (C) must occur. □
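As a worked step, the three possible changes of $t_i$ can be read off from the definition $t_i = r_i - (k_i - \ell_i - 1)$ together with $r_{i+1} = r_i - 1$ (the case distinction below is spelled out here for convenience):

```latex
t_{i+1} - t_i
  = (r_{i+1} - r_i) - (k_{i+1} - k_i) + (\ell_{i+1} - \ell_i)
  = -1 - (k_{i+1} - k_i) + (\ell_{i+1} - \ell_i)
  = \begin{cases}
      -2 & \text{if } k_{i+1} = k_i + 1,\\
      \phantom{-}0 & \text{if } \ell_{i+1} = \ell_i + 1,\\
      -1 & \text{if } m_{i+1} = m_i + 1.
    \end{cases}
```

Since $t_i$ never drops by more than 2 in one step, it cannot pass from a value of at least 2 to a negative value without first taking the value 0 or 1.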

Let $(k, \ell, m)$ be the state at the end of Phase I, with K and L representatives
of the two largest color classes (if not empty), r remaining balls, $t = r - (k - \ell - 1)$
and

    $T \le 2k + \ell + 2m - 2$    (1)

    $n = k + \ell + m + r$.    (2)



Phase II. Paul acts differently depending on how Phase I ended. 

Case 1: (A) occurred first.

This means that $k = \ell = m$ and that the total number of comparisons done in
Phase I is $T \le 5k - 2$, by (1).

If r = 0, then there are no remaining balls and Paul has learned that the three
color classes have the same cardinality. Paul wins the game by stating that there is no
plurality. Hence, as k = n/3, for the total number of comparisons we
have

    $L(n) \le T \le 5k - 2 = \frac{5}{3}n - 2$.

If r > 0, the plurality among the n balls is the plurality among the $r = n - 3k$
remaining balls. As r < n, by induction, Paul wins the game using $5r/3 - 2$ extra
comparisons. Hence

    $L(n) \le T + \frac{5r}{3} - 2 \le 5k - 2 + \frac{5(n - 3k)}{3} - 2 = \frac{5n}{3} - 4$.



Case 2: (B) occurred first. 

Paul wins the game claiming that K is of the plurality color. In fact, $t = r -
(k - \ell - 1) = 0$ means $k = \ell + r + 1$, and even if all remaining balls have the same
color as L, there still is one more ball colored as K. Hence the color of K is the plurality
color.

To count the number of comparisons used by Paul, observe that

    $k = \ell + r + 1$
    $\;\; = \ell + n - k - \ell - m + 1$    by (2)
    $\;\; = n - k - m + 1$,



and 
$3k = k + (\ell + r + 1) + m + (\ell - m) + r + 1 = n + r + (\ell - m) + 2$.

Suppose r = 0; then $\ell > m$. Indeed, if $\ell = m$, then the terminal state is
$(k, k - 1, k - 1)$ and thus the previous state was either $(k - 1, k - 1, k - 1)$, and
the game would have finished by (A), or $(k, k - 1, k - 2)$, and the game would
have finished by (C). Hence $\max\{r, \ell - m\} \ge 1$, and so $3k \ge n + 3$, implying
$k \ge n/3 + 1$.

It follows that

    $L(n) \le T \le 2k + \ell + 2m - 2$    by (1)
    $\;\; = 2n - \ell - 2r - 2$    by (2)
    $\;\; = 2n - (\ell + r + 1) - r - 1$
    $\;\; = 2n - k - r - 1$    because t = 0
    $\;\; \le 5n/3 - 1 - r - 1$    because $k \ge n/3 + 1$
    $\;\; \le 5n/3 - 2$.

Case 3: (C) occurred first.

We have that $t = r - (k - \ell - 1) = 1$ if and only if $k = \ell + r$ and, hence, K is of
the plurality color unless all the r remaining balls have the color of L (or M if
$\ell = m$) or unless there are no remaining balls.

If r = 0 then $k = \ell \ge m$ and the game ends with Paul claiming that
there is no plurality. To bound the total number of comparisons, observe that
$n = k + \ell + m = 2k + m \le 3k$ and hence $k \ge n/3$. We have

    $L(n) \le T \le 2k + \ell + 2m - 2$
    $\;\; = 2n - k - 2$    by (2)
    $\;\; \le 5n/3 - 2$.

If $r \ge 1$, Paul takes a ball R from the remaining balls and compares it to the
other r - 1 balls. As soon as Carol answers NO, Paul wins the game claiming K
is of plurality color. If Carol always answers YES, then Paul wins using one last
comparison.

If $\ell = m$, Paul tests R : K:
    if YES, K is of plurality color;
    if NO, there is no plurality.

If $\ell > m$, Paul tests R : L:
    if YES, there is no plurality;
    if NO, K is of plurality color.

Altogether, the total number of comparisons is $L(n) \le T + r$. As $n = k + \ell +
m + r = 2k + m \le 3k$, we have $k \ge n/3$ and so
    $L(n) \le T + r$
    $\;\; \le 2k + \ell + 2m - 2 + r$    by (1)
    $\;\; = 2n - \ell - 2r - 2 + r$    by (2)
    $\;\; = 2n - k - 2 \le 5n/3 - 2$.

□ 



3 The Lower Bound 

In this section we show a $3\lfloor n/2\rfloor - 2$ lower bound for the plurality problem with
three colors, red, blue and green (r, b and g for short). For the sake of presentation,
we will first assume that n is even and then explain how to derive the same
bound when n is odd.

Any algorithm used by Paul can be seen as a sequence of steps in which
Paul selects a pair of balls x, y and receives from Carol the answer YES or NO,
meaning respectively that x and y are colored with the same color or not.

During the game, Carol builds a graph H = (V, E) (Carol's graph), where
each node in $V \subseteq [n] = \{1, \ldots, n\}$ represents a ball that Paul involved in at
least one comparison, and $(x, y) \in E$ if and only if Paul asked to compare x and
y; the edges are labeled with YES or NO according to the answers Carol
gave. The edges of H will be called YES-edges or NO-edges if they are labeled
with YES or NO, respectively. Moreover, by $H_Y$ and $H_N$ we denote respectively
the graph induced by the set $E_Y$ of YES-edges and the set $E_N$ of NO-edges of H.
Assume n is even, unless specified otherwise.

Definition 1. A graph H is said to be nice if it satisfies the following properties:

- $H_N = (S_1 \cup S_2, E_N)$ is a bipartite graph, $V = S_1 \cup S_2$, $S_1 \cap S_2 = \emptyset$;

- $|S_1| \le n/2$ and $|S_2| \le n/2$;

- $H_Y$ has no edge connecting a node $x \in S_1$ with a node $y \in S_2$.

Let us show by induction that Carol has a strategy such that, at each step 
of any algorithm chosen by Paul, Carol’s graph H is nice. 

At the beginning of the game, Carol's graph is empty and thus trivially nice.
Therefore, assume that Carol has a nice graph $H = (S_1 \cup S_2, E)$.

Let x, y be the pair of balls selected by Paul at the new step. Carol has to
deal with one of the following cases.

Case 1: $x \in V$ and $y \in [n] \setminus V$.

Suppose w.l.o.g. that $x \in S_1$. If $|S_2| < n/2$, then Carol adds y to $S_2$ and
answers NO. If $|S_2| = n/2$, then it must be that $|S_1| < n/2$. In this case Carol adds y
to $S_1$ and answers YES.

In both cases the new graph $H = (V \cup \{y\}, E \cup \{(x, y)\})$ is nice according to
the new partition given by the sets $S_1$, $S_2 \cup \{y\}$ in the former case, and by $S_1 \cup \{y\}$, $S_2$
in the latter.

Case 2: $x, y \in [n] \setminus V$.

If $|S_1| < n/2$ and $|S_2| < n/2$, Carol adds x to $S_1$ and y to $S_2$ and answers
NO. Otherwise, suppose w.l.o.g. that $|S_1| = n/2$. Then it must be that $|S_2| \le n/2 - 2$,
and Carol adds x and y to $S_2$, answering YES.
In both cases the new graph $H = (V \cup \{x, y\}, E \cup \{(x, y)\})$ is nice according
to the new partition given by the sets $S_1 \cup \{x\}$, $S_2 \cup \{y\}$ in the former case and by
$S_1$, $S_2 \cup \{x, y\}$ in the latter.

Case 3: $x, y \in V$.

If $x \in S_1$ and $y \in S_2$ (or vice versa), then Carol answers NO; otherwise she answers YES.
Therefore, in any case the new graph $H = (V, E \cup \{(x, y)\})$ is nice according
to the partition sets $S_1$ and $S_2$.
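Carol's three cases translate almost literally into code. The following is a minimal sketch for even n, assuming balls are numbered 0..n-1 and that Paul never compares a ball with itself; the class name `Carol` and its attributes are illustrative choices, not notation from the paper.

```python
class Carol:
    """Adversary for even n that keeps her graph nice (Cases 1-3)."""

    def __init__(self, n):
        if n % 2:
            raise ValueError("this sketch covers only even n")
        self.half = n // 2
        self.side = {}        # ball -> 1 or 2, i.e. membership in S_1 or S_2
        self.yes_edges = 0
        self.no_edges = 0

    def size(self, s):
        return sum(1 for v in self.side.values() if v == s)

    def answer(self, x, y):
        """Return True for YES (same color) and False for NO."""
        if x not in self.side and y in self.side:
            x, y = y, x       # make x the already known ball, if there is one
        if x in self.side and y not in self.side:          # Case 1
            other = 3 - self.side[x]
            if self.size(other) < self.half:
                self.side[y] = other
            else:
                self.side[y] = self.side[x]
        elif x not in self.side:                             # Case 2
            if self.size(1) < self.half and self.size(2) < self.half:
                self.side[x], self.side[y] = 1, 2
            else:
                roomy = 2 if self.size(1) == self.half else 1
                self.side[x] = self.side[y] = roomy
        same = self.side[x] == self.side[y]                  # Case 3 rule
        if same:
            self.yes_edges += 1
        else:
            self.no_edges += 1
        return same


if __name__ == "__main__":
    carol = Carol(6)
    for pair in [(0, 1), (2, 3), (0, 2), (4, 5), (1, 4)]:
        print(pair, "YES" if carol.answer(*pair) else "NO")
```

Every answer is consistent with coloring one side red and the other blue, which is what keeps Paul from learning anything before both sides are full.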

Since we have shown that Carol has a strategy that allows her to maintain a
graph that is nice, in the following we will always assume that Carol's graph is
nice. Observe that Carol is always guaranteed that

    $|E_N| \ge \max\{|S_1|, |S_2|\}$.    (3)

In fact, any new node inserted in H is inserted with a new NO-edge incident
on it, unless $\max\{|S_1|, |S_2|\}$ is already n/2.

In the following we will say that a nice graph admits a coloring if the coloring
is consistent with the labelling of YES and NO edges.

Lemma 1. Let $H = (S_1 \cup S_2, E)$ be Carol's graph at the end of the game. Paul
wins the game only if $S_1$ and $S_2$ are YES-components of cardinality n/2 each.

Proof. In order to prove the lemma, we will show that if $S_1$ and $S_2$ are not
YES-components of cardinality n/2 each, then whenever Paul claims that there is no
plurality, Carol is able to show that H admits a coloring c having a plurality
color. On the other hand, whenever Paul indicates that u is of plurality color,
Carol is able to show that H admits another coloring c in which c(u) is not the
plurality color. In the following, given a set S of balls and a color $col \in \{r, b, g\}$,
c(S) = col means that all the balls in S are colored with col.

Assume first that $\min\{|S_1|, |S_2|\} = |S_1| < n/2$. Let $V_1, V_2 \subseteq [n]$ be two disjoint
sets of nodes such that $V_1 \cup V_2 = [n] \setminus (S_1 \cup S_2)$ and $|V_j| + |S_j| = n/2$, for $j \in \{1, 2\}$.
Of course, $|V_1| > 0$ and $|V_2| \ge 0$.

If Paul claims that there is no plurality or if he claims that $u \in S_1$ is of the
plurality color, Carol shows the coloring c such that $c(S_1) = r$, $c(V_1) = b$ and
$c(S_2 \cup V_2) = g$. Graph H admits c, but in c there is a plurality color, different
from c(u).

If Paul claims that $u \in S_2$ is of the plurality color, Carol shows the coloring
c such that $c(S_1 \cup V_1) = r$ and $c(S_2 \cup V_2) = g$. It is easy to see that H admits c,
but c has no plurality color.

In any case Paul is wrong.

Therefore, we can assume $|S_1| = |S_2| = n/2$. To prove that $S_1$ and $S_2$ have
to be YES-components, we can proceed analogously, assuming there is a third
YES-component that plays the role of $V_1$.

□ 

Theorem 2. To solve the plurality problem with 3 colors, Paul needs at least
$3n/2 - 2$ comparisons in the worst case.

Proof. Let $H = (S_1 \cup S_2, E)$ be Carol's graph at the end of a game Paul won.
Then by Lemma 1, $S_1$ and $S_2$ are YES-components of cardinality n/2 each. Thus,
the number of YES-edges in each YES-component is at least $n/2 - 1$. From (3) it
follows that the number of NO-edges in H is at least n/2.

The number of comparisons used by Paul is the number of edges in H, that
is, the number of edges in $H_Y$ plus the number of edges in $H_N$, i.e., at least $3n/2 - 2$.

□ 

Let us now see how to derive the same lower bound in the case n is odd.
When n is odd, Carol cannot generalize the strategy she used for the case n
even by just building a nice graph in which $S_1$ has cardinality $\lfloor n/2\rfloor$ and $S_2$
has cardinality $\lceil n/2\rceil$ (or vice versa). In fact, once Paul has a YES-component
of cardinality $\lceil n/2\rceil$, he wins the game by claiming that the color of the nodes
in that YES-component is the plurality color. The point is that Paul can build a
YES-component of $\lceil n/2\rceil$ nodes using only $2\lceil n/2\rceil = n + 1$ comparisons.

Hence Carol's strategy has to be slightly modified. As in the case n even,
she builds a nice graph H where the cardinality of the sets $S_1$ and $S_2$ is bounded
by $\lfloor n/2\rfloor$. When Paul involves the last node, say l, in a comparison for the first
time, Carol puts l in a third set $S_3$ and answers that the two nodes have different
colors. In the sequel, whenever l is involved in a comparison, Carol will say
that the two nodes have different colors and will label all edges incident to l
with NN. Such edges are called NN-edges and the set of all NN-edges is denoted
by $E_{NN}$.

Let $H = (S_1 \cup S_2 \cup \{l\}, E)$ be Carol's graph at the end of the game and
assume that $S_i$ contains $k_i$ YES-components, for i = 1, 2.

It is clear that “no plurality” is always possible by coloring $S_1$ red, $S_2$ blue
and l green. Hence, since Paul wins, he must be able to exclude the possibility
that there is a plurality. From this we conclude:

(1) l must be connected to $S_1$ and $S_2$. Otherwise, if e.g. l is not connected
to $S_1$, $c(S_1 \cup \{l\}) = r$, $c(S_2) = b$ would be a plurality coloring.

(2) If $k_i \ge 3$ (i = 1, 2), then l must be connected to every YES-component of
$S_i$.

Otherwise, if $C \subseteq S_1$ is a component not connected to l, then $c(S_1 \setminus C) = r$,
$c(S_2) = b$, $c(C \cup \{l\}) = g$ would give a blue plurality.

It follows that l is connected by at least $k_i - 1$ edges to $S_i$. With $|E_N| \ge \lfloor n/2\rfloor$
(as in the case when n is even) we have that

    $L(n) \ge |E_N| + |E_Y| + |E_{NN}| \ge \lfloor n/2\rfloor + 2\lfloor n/2\rfloor - k_1 - k_2 + (k_1 + k_2 - 2) = 3\lfloor n/2\rfloor - 2$.

This concludes the proof of Theorem 2 both for n even and odd. 

It is straightforward to see that, using the same argument, the same lower 
bound for the plurality problem with 3 colors can also be derived for the plurality 
problem with any number of colors greater than 3. That is, we can state the 
following more general result. 

Theorem 3. To solve the plurality problem with > 3 colors, Paul needs at least 
3n/2 — 2 comparisons in the worst case. 

□ 
Abstract. The μ-calculus model-checking problem has been of great interest in
the context of concurrent programs. Beyond the need to use symbolic methods
in order to cope with the state-explosion problem, which is acute in concurrent
settings, several concurrency-related problems are naturally solved by evaluation
of μ-calculus formulas. The complexity of a naive algorithm for model checking
a μ-calculus formula ψ is exponential in the alternation depth d of ψ. Recent
studies of the μ-calculus and the related area of parity games have led to algorithms
exponential only in d/2. No symbolic version, however, is known for the improved
algorithms, sacrificing the main practical attraction of the μ-calculus.

The μ-calculus can be viewed as a fragment of first-order fixpoint logic. One of
the most fundamental theorems in the theory of fixpoint logic is the Collapse
Theorem, which asserts that, unlike the case for the μ-calculus, the fixpoint
alternation hierarchy over finite structures collapses at its first level. In this paper we
show that the Collapse Theorem of fixpoint logic holds for a measured variant of
the μ-calculus, which we call the $\mu^{\#}$-calculus. While μ-calculus formulas represent
characteristic functions, i.e., functions from the state space to {0, 1}, formulas
of the $\mu^{\#}$-calculus represent measure functions, which are functions from the
state space to some measure domain. We prove a Measured-Collapse Theorem:
every formula in the μ-calculus is equivalent to a least-fixpoint formula in the
$\mu^{\#}$-calculus. We show that the Measured-Collapse Theorem provides a logical
recasting of the improved algorithm for μ-calculus model checking, and describe how
it can be implemented symbolically using Algebraic Decision Diagrams. Thus,
we describe, for the first time, a symbolic μ-calculus model-checking algorithm
whose complexity matches that of the best known enumerative algorithm.



1 Introduction 

The modal μ-calculus, often referred to as the “μ-calculus”, is a propositional modal
logic augmented with least and greatest fixpoint operators. It was introduced in [22],
following earlier studies of fixpoint calculi in the theory of program correctness [11,
31,32]. Over the past 20 years, the μ-calculus has been established as essentially the
“ultimate” program logic, as it expressively subsumes all propositional program logics,
including dynamic logics such as PDL, process logics such as YAPL, and temporal logics
such as CTL* [13]. The μ-calculus has gained further prominence with the discovery
that its formulas can be evaluated symbolically in a natural way [6], leading to industrial
acceptance of computer-aided verification.

A central issue for any logic is the model-checking problem: is a given structure
a model of a given formula? For modal logics we ask whether a given formula holds
in a given state of a given Kripke structure. The μ-calculus model-checking problem
has been of great interest in the context of concurrent programs. A significant feature of
expressing model checking in terms of the μ-calculus is that it naturally leads to symbolic
algorithms, which operate on sets of states and can scale up to handle exceedingly large
state spaces [28]. Beyond the need to use symbolic methods in order to cope with the
state-explosion problem [6], which is acute in concurrent settings, several
concurrency-related problems are naturally solved by evaluation of μ-calculus formulas. This includes
checks for fair simulation between two components of a concurrent system [14,16] and
reasoning about the interaction between a component and its environment, which is
naturally expressed by means of parity games [8] (solving parity games is known to be
equivalent to μ-calculus model checking [12]). Indeed, the model-checking problem for
the μ-calculus has been the subject of extensive research (see [10] for an overview and
[18,19,20,23,25,27,33] for more recent work). The precise complexity of this problem
has been open for a long time; it was known to be in UP ∩ co-UP [19] and PTIME-hard
[25].

From a practical perspective, the interesting algorithms are those that have time 
bounds of the form where n is the product of the size of the structure and the 

length of the formula, and d is the alternation depth of the formula, which measures 
the depth of alternation between least fixpoint and greatest fixpoint operators. A naive 
algorithm would have d as the exponent, since alternating fixpoints of depth d yield nested 
loops of depth d, each of which involves n iterations. This naive algorithm uses space 
O(dn) [13]. The alternation depth is interesting as a measure of syntactic complexity,
since, on one hand, many logics can be expressed in low-alternation-depth fragments
of the μ-calculus [13,12], and, on the other hand, the μ-calculus alternation hierarchy
is strict [4]. As noted, the naive algorithm can be naturally implemented in a symbolic 
fashion, operating on sets of states. 

The first improvement to the naive approach was presented in [27] (and slightly
improved in [33]), which got the exponent down to d/2 at the cost of exponential worst-case
space complexity. It was then shown by Jurdziński [20] how to obtain the improved
exponent together with the O(dn) space bound. Common to these algorithms is the
elimination of alternating fixpoints; they use monotone fixpoint computation that simulates
the effects of alternating fixpoints by means of so-called progress measures. Progress
measures are functions that measure the progress of a computation; see [21,30,24] for
other applications. While the improved algorithms have better time complexity, they
sacrifice the main practical attraction of the μ-calculus: these algorithms are enumerative
and no symbolic version of them is known.

It is well known that modal logic can be viewed as a fragment of first-order logic 
[2]. Thus, the μ-calculus can be viewed as a fragment of first-order fixpoint logic, often




524 



D. Bustan, O. Kupferman, and M.Y. Vardi 



referred to as “fixpoint logic”, which is the extension of first-order logic with least and
greatest fixpoint operators. Fixpoint logic has been the subject of extensive research in the
context of database theory [1] and finite-model theory [9]. One of the most fundamental
theorems in the theory of fixpoint logic is the Collapse Theorem, which asserts that,
unlike the case for the μ-calculus, the fixpoint alternation hierarchy over finite structures
collapses at its first level; that is, every formula in fixpoint logic can be expressed as
a least-fixpoint formula [15,17,26]. The key to this collapse is the simulation of the
effect of alternating fixpoints by means of so-called stage functions, which measure the
progress of fixpoint computations.

Our main result in this paper is the unification of these two disparate lines of research.
We show that the Collapse Theorem of fixpoint logic can be adapted to the μ-calculus.
Both progress measures and stage functions measure the progress of fixpoint
computations. The key difference between fixpoint logic and the μ-calculus is that while in
fixpoint logic progress measures can be constructed within the logic (by means of the
Stage-Comparison Theorem [29]), this cannot be done in the μ-calculus [4], since it
allows fixpoint operators only on unary predicates. In order to simulate the construction
of progress measures within the μ-calculus, we define the $\mu^{\#}$-calculus. While in the
μ-calculus variables represent characteristic functions, i.e., functions from the state space
to {0, 1}, in the $\mu^{\#}$-calculus variables represent measure functions, which are functions
from the state space to some measure domain. We then prove a Measured-Collapse
Theorem: every formula in the μ-calculus is equivalent to a least-fixpoint formula in the
$\mu^{\#}$-calculus.

We then show that the Measured-Collapse Theorem provides a logical recasting of the
improved algorithm in [20]. By starting with a μ-calculus formula of alternation depth d,
collapsing it to a least-fixpoint $\mu^{\#}$-calculus formula with measure domain {0, . . .
and then computing the least fixpoint, we get the improved exponent of d/2 together with
the O(dn) space bound. Furthermore, this logical recasting of the algorithm suggests
how it can be implemented symbolically. A symbolic evaluation of μ-calculus formulas
uses Binary Decision Diagrams [5] to represent characteristic functions [6]. For the
$\mu^{\#}$-calculus, we suggest representing measure functions by Algebraic Decision Diagrams,
which extend Binary Decision Diagrams by allowing arbitrary numerical domains [7].
Thus, we describe, for the first time, a symbolic μ-calculus model-checking algorithm
whose complexity matches that of the best known enumerative algorithm. In fact,
as detailed in Section 4, working with the $\mu^{\#}$-calculus enables us to decrease the bound on
the number of iterations needed for the simultaneous calculations, leading to a slightly
better complexity.



2 Preliminaries 

The μ-calculus is a propositional modal logic augmented with least and greatest fixpoint
operators [22]. We consider a μ-calculus where formulas are constructed from Boolean
propositions with Boolean connectives, the temporal operators ◇ (“exists next”) and □
(“for all next”), as well as least (μ) and greatest (ν) fixpoint operators. We assume that
μ-calculus formulas are written in positive normal form (negation only applied to atomic
propositions).
Formally, let AP be a set of atomic propositions and let X be a set of variables.
The set of μ-calculus formulas over AP and X is defined by induction as follows. (1)
If $p \in AP$, then p and ¬p are μ-calculus formulas. (2) If $x \in X$, then x is a μ-calculus
formula (in which x is free). (3) If φ and ψ are μ-calculus formulas, then $\varphi \lor \psi$, $\varphi \land \psi$, ◇φ,
and □φ are μ-calculus formulas. (4) If $x \in X$, then μx.φ and νx.φ are μ-calculus
formulas (in which x is bound). The semantics of the μ-calculus is defined with respect to a
Kripke structure $M = \langle S, R, L\rangle$ and an assignment $f : X \to 2^S$ to the variables. Let
$\mathcal{F}$ denote the set of all assignments. For an assignment $f \in \mathcal{F}$, a variable $x \in X$, and a
set $S' \subseteq S$, we use $f|_{x=S'}$ to denote the assignment in which x is assigned S' and all
other variables are assigned as in f. A formula φ is interpreted as a function $\varphi^M : \mathcal{F} \to 2^S$.
Thus, given an assignment $f \in \mathcal{F}$, the formula φ defines a subset of states that satisfy φ
with respect to this assignment. (For a definition of the function $\varphi^M$ see the full version
or [22]. When M is clear from the context, we omit it.) A formula with no free variables
is called a sentence. Note that the assignment f is required only for the valuation of the
free variables in φ. In particular, no assignment is required for sentences. For a sentence
φ, we say that $M, s \models \varphi$ if $s \in \varphi^M(f)$, for (the arbitrarily chosen) f with $f(x) = \emptyset$ for
all $x \in X$.

Let λ denote μ or ν. We assume that every variable $x \in X$ is bound at most once. We
refer to the fixpoint subformula in which x is bound as λ(x). If λ = μ, we say that x is a
μ-variable, and if λ = ν, we say that it is a ν-variable. Consider a μ-calculus formula of the
form λx.φ. Given an assignment $f \in \mathcal{F}$, we define a sequence of functions $\varphi^j(f) : 2^S \to
2^S$ inductively as follows: $\varphi^0(f)(S') = S'$ and $\varphi^{j+1}(f)(S') = \varphi(f|_{x=\varphi^j(f)(S')})$.
Given a μ-calculus formula ψ and a subformula $\varphi = \lambda x.\varphi'$ of ψ, we define the alternation level
of φ in ψ, denoted $al_\psi(\varphi)$, as follows [3]. If φ is a sentence, then $al_\psi(\varphi) = 1$. Otherwise,
let $\xi = \lambda' y.\xi'$ be the innermost μ or ν subformula of ψ that has φ as a subformula, and y
is free in φ. Then if $\lambda' \ne \lambda$, we have $al_\psi(\varphi) = al_\psi(\xi) + 1$. Otherwise, $al_\psi(\varphi) = al_\psi(\xi)$.
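The iteration sequence $\varphi^j(f)$ defined above is what a naive evaluator computes: starting from the empty set for a μ-variable (or from the full state set for a ν-variable), it reapplies the body until nothing changes. The sketch below illustrates this on a hypothetical four-state Kripke structure; the helper names `diamond` and `iterate` and the toy structure are assumptions made for illustration and are not taken from the paper.

```python
def diamond(R, states):
    """Exists-next: states with at least one R-successor in `states`."""
    return {s for (s, t) in R if t in states}


def iterate(step, start):
    """Apply a monotone `step` repeatedly from `start` until it stabilises."""
    current = set(start)
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt


# Toy Kripke structure (hypothetical): 0 -> 1 -> 2, 3 -> 3; p holds in state 2 only.
S = {0, 1, 2, 3}
R = {(0, 1), (1, 2), (3, 3)}
P = {2}

# mu x. (p or <>x): states from which a p-state is reachable.
reach_p = iterate(lambda X: P | diamond(R, X), set())
# nu x. <>x: states with an infinite path.
infinite_path = iterate(lambda X: diamond(R, X), S)

if __name__ == "__main__":
    print(sorted(reach_p), sorted(infinite_path))
```

Nesting such loops, one per fixpoint operator, is what leads to the exponential dependence on the nesting depth for formulas with alternation.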

Intuitively, the alternation level of φ in ψ is the number of alternating fixpoint
operators we have to “wrap φ with” in order to reach a sub-sentence of ψ. For a variable x, the
alternation level of x, denoted al(x), is the alternation level of λ(x). Note that it may be
that λ(x) is a subformula of λ(x') and al(x) = al(x'). The definition of al(x) partitions
X into equivalence classes according to the variables' alternation level. Note that an
equivalence class may contain variables that are independent. In order to refine the classes
further, we define the order ≺ to be the minimal relation that satisfies the following. (1)
If x' is free in λ(x) then x ≺ x'. (2) If x ≺ y and y ≺ x' then x ≺ x'. We define the ≈
equivalence relation to be the minimal equivalence relation that contains all pairs (x, x')
such that x ≺ x' and al(x) = al(x'). The relation ≈ refines the partition induced by al(x)
so that each class contains variables at the same alternation level that do depend on each
other and are all either μ-variables or ν-variables. We define the width width(i) of
an alternation level i as the maximal size of an equivalence class that is contained in the
i'th alternation level. Another property of the ≈ relation is that for every equivalence
class $X^{\approx}$ there exists a unique variable $x_m = \max(X^{\approx})$ in $X^{\approx}$ such that for every other
variable $x \in X^{\approx}$ we have $x \prec x_m$. We can simultaneously calculate the fixpoint values
of all the variables that are in the same equivalence class.

The reason that we use simultaneous fixpoints is that the evaluation of the variables of
a μ-calculus formula as defined above is hierarchical, in the sense that in order to update
the value of a variable x, we first evaluate all the variables that appear in subformulas of
λ(x). Since the value of x might be updated up to |S| times, this makes the complexity
of the evaluation exponential in the nesting depth of the fixpoint operators. It turns out
that this hierarchical computation is needed only when there is alternation of μ- and
ν-variables. Thus, if λ(x) is a subformula of λ(y) but x ≈ y, we can compute their values
simultaneously. This could reduce the complexity substantially.

Next, we define a simultaneous fixpoint operation over equivalence classes organized
in tuples. Let $X^{\approx}$ be an equivalence class of variables with respect to $\approx$. Let $X'$ be the
set of variables $\{x' \mid \exists x \in X^{\approx}.\, x \prec x'\}$, and let $X''$ be the set $\{x'' \mid \exists x \in X^{\approx}.\, x'' \prec x\}$.
Let $x_m = max(X^{\approx})$; then the subformula $\lambda(x_m) = \lambda x_m.\varphi_m$ binds all variables of $X^{\approx}$.
Given an assignment $f : X' \to 2^S$ we consider $\varphi_m(f)$ as a function
$\varphi_m(f) : (X^{\approx} \to 2^S) \to (X^{\approx} \to 2^S)$. This function is used to define the simultaneous
fixpoint value of $X^{\approx}$. Note that all the variables in $\varphi_m$ are either in $X''$ or in $X' \cup X^{\approx}$.
Given an assignment $f : X' \to 2^S$, assume that an extension of $f$ to $X' \cup X^{\approx}$ recursively
determines the values of the variables in $X''$, or more precisely the values of the subformulas
$\lambda(x'')$. Thus the subformulas that are not determined in $\varphi_m$ are of the form $\lambda(x')$ where
$x' \in X' \cup X^{\approx}$. We determine these values using the extended assignment; then, for every
variable $x \in X^{\approx}$, we can calculate the value of $\lambda(x)$ and determine its new value. We define
the simultaneous fixpoint value of $X^{\approx}$ as $\bigcap\{S' : \varphi_m(f)(S') \subseteq S'\}$ for a $\mu$-class and
$\bigcup\{S' : S' \subseteq \varphi_m(f)(S')\}$ for a $\nu$-class.

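Theorem 1. For every variable $x$, the $\mu$-calculus and the simultaneous fixpoint assign
the same value to $x$.

Theorem 2. (Extended Knaster-Tarski)

- $\bigcap\{S' \mid \varphi_m(f)(S') \subseteq S'\} = \bigcap\{S' \mid S' = \varphi_m(f)(S')\} = \bigcup_{i \ge 0} \varphi_m^{i}(f)((\emptyset, \emptyset, \ldots, \emptyset)) = \varphi_m^{|X^{\approx}| \cdot |S|}(f)((\emptyset, \emptyset, \ldots, \emptyset))$.

- $\bigcup\{S' \mid S' \subseteq \varphi_m(f)(S')\} = \bigcup\{S' \mid S' = \varphi_m(f)(S')\} = \bigcap_{i \ge 0} \varphi_m^{i}(f)((S, S, \ldots, S)) = \varphi_m^{|X^{\approx}| \cdot |S|}(f)((S, S, \ldots, S))$.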
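
As a rough illustration of the simultaneous least fixpoint, the following Python sketch iterates a monotone update on a tuple of set-valued variables, starting from empty sets, until nothing changes. The names `simultaneous_lfp`, `step` and `pre`, the three-state structure, and the two mutually dependent variables are hypothetical and are not taken from the paper.

```python
def simultaneous_lfp(step, variables):
    """Iterate a monotone update on all variables at once, starting from empty sets."""
    current = {x: frozenset() for x in variables}
    while True:
        updated = step(current)
        if updated == current:  # no variable changed: a fixpoint is reached
            return current
        current = updated

# Hypothetical three-state structure with edges 0 -> 1 -> 2.
EDGES = {(0, 1), (1, 2)}
TARGET = frozenset({2})

def pre(states):
    """States with at least one successor in `states`."""
    return frozenset(s for (s, t) in EDGES if t in states)

def step(values):
    # Two mutually dependent variables of the same kind, updated together
    # from the previous values of both.
    return {"x": TARGET | pre(values["y"]), "y": values["x"]}

result = simultaneous_lfp(step, ["x", "y"])
```

Both variables are updated in the same pass rather than one being recomputed to convergence inside each update of the other, which is the saving the simultaneous fixpoint is meant to capture.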

3 The Logic $\mu^{\#}$-Calculus

While a formula of the $\mu$-calculus defines a subset of $S$, namely a mapping from $S$ to
$\{0, 1\}$, a formula of the $\mu^{\#}$-calculus defines a mapping from $S$ to a domain $D$, where $D$
is parameterized by a natural number $k$ and a sequence of natural numbers $n_0, n_1, \ldots, n_k$
such that $D = \bigcup_{i=0}^{k} (\{1, 2, \ldots, n_0\} \times \{1, 2, \ldots, n_1\} \times \ldots \times \{1, 2, \ldots, n_i\}) \cup \{\infty, -\infty\}$.
We start with the syntax of the $\mu^{\#}$-calculus. As in the $\mu$-calculus, formulas are defined
with respect to a set $AP$ of atomic propositions and a set $X$ of variables. In the $\mu^{\#}$-calculus,
however, each variable is associated with an arity. We write $x^{(c)}$ to indicate that variable $x$ has
arity $c$. Given $AP$ and $X$, the set of $\mu^{\#}$-calculus formulas (in positive normal form)
over $AP$ and $X$ is defined by induction as follows.

- If $p \in AP$, then $p$ and $\neg p$ are $\mu^{\#}$-calculus formulas.

- If $x^{(c)} \in X$, then $x^{(c)}$ is a $\mu^{\#}$-calculus formula (in which $x$ is free).

- If $\varphi$ and $\psi$ are $\mu^{\#}$-calculus formulas then

  • $\varphi \vee \psi$ and $\varphi \wedge \psi$ are $\mu^{\#}$-calculus formulas,

  • $\Diamond\varphi$ and $\Box\varphi$ are $\mu^{\#}$-calculus formulas,

  • For $x^{(c)} \in X$, we have that $\mathrm{set}\, x^{(c)}.\varphi$ and $\mathrm{inc}\, x^{(c)}.\varphi$ are $\mu^{\#}$-calculus formulas
    (in which $x$ is bound).

We define an alternation level, a preorder $\prec$ and an equivalence relation $\approx$ over $X$ in
the same way we define them for the $\mu$-calculus. We say that a $\mu^{\#}$-calculus formula is well
formed if

- The arity $c$ of a set-variable is equal to the minimal arity of inc-variables with
alternation level smaller than $al(x)$.

- The arity $c$ of an inc-variable is equal to the minimal arity of set-variables with
alternation level smaller than $al(x)$, minus one.

We use $sub(\psi)$ to denote all the subformulas of $\psi$. Before defining the semantics of
the $\mu^{\#}$-calculus, we define a parameterized order over the tuples in $D$. Intuitively, the
order is lexicographic, and the parameter enables us to restrict attention to prefixes of
the tuples. Formally, we have the following.

Definition 1. For $d, d' \in D$ and $l \ge 0$, we say that $d <_l d'$ if either $d' = \infty$ and $d \neq \infty$,
or $d' \neq -\infty$ and $d = -\infty$, or $d = (d_0, \ldots, d_i)$ and $d' = (d'_0, \ldots, d'_j)$, and either:

- For some $k \le \min(i, j, l)$ we have $d_k < d'_k$ and for every $0 \le m < k$, $d_m = d'_m$.

- $i < \min(l, j)$ and for every $k \le i$ we have $d_k = d'_k$.

Definition 2. For $d, d' \in D$ and $l \ge 0$, we say that $d =_l d'$ if either $d = d'$ or
$d = (d_0, \ldots, d_i)$ and $d' = (d'_0, \ldots, d'_j)$, and $l \le \min(i, j)$ and for every $k \le l$ we have
$d_k = d'_k$.

Note that $<_l$ is a total order over the tuples with arity $\le l$. We sometimes use the
order without the parameter, with the usual lexicographic interpretation. Thus, $d < d'$ if
$d <_l d'$ for $l = \max\{|d|, |d'|\}$, and the minimum and maximum tuple of a set of tuples
are defined similarly.

For $d = (d_0, \ldots, d_i)$ and $l \ge 0$, let $\mathrm{set}_l(d)$ be the greatest $l$-tuple $d'$ such that $d' \le_l d$.
If $d = \infty$ or $d = -\infty$, then $\mathrm{set}_l(d) = d$. Also, let $\mathrm{inc}_l(d)$ be the smallest $l$-tuple $d'$ in
$D$ such that $d <_l d'$. Since $<_l$ is total, such a unique tuple exists. If $d = (n_0, n_1, \ldots, n_l)$,
then $\mathrm{inc}_l(d) = \infty$, if $d$ is $\infty$ then $\mathrm{inc}_l(d) = d$, and if $d$ is $-\infty$ then $\mathrm{inc}_l(d)$ is the
$l$-tuple $(1, 1, \ldots, 1)$.
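
To make the two operations concrete, here is a small worked example. It assumes every component ranges over $\{1, 2\}$, writes tuples by listing their components, and takes $\mathrm{set}$ to truncate a longer tuple to a prefix. The subscript below counts the components that are kept; this counting convention is an assumption made only for the illustration.

```latex
\begin{align*}
\mathrm{inc}_2(-\infty) &= (1,1), & \mathrm{inc}_2((1,1)) &= (1,2), \\
\mathrm{inc}_2((1,2))   &= (2,1), & \mathrm{inc}_2((2,2)) &= \infty, \\
\mathrm{set}_1((2,1))   &= (2),   & \mathrm{set}_2(\infty) &= \infty .
\end{align*}
```

Read this way, inc behaves like a bounded counter that overflows to $\infty$, while set discards the components beyond the kept prefix.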

Consider a Kripke structure $M = (S, R, L)$. A measure function for $M$ is a function
$g : S \to D$. For $c \ge 1$, we say that $g$ is a measure function of arity $c$ if for all
$s \in S$, we have that $g(s)$ is either a $c$-tuple in $D$ or an element of $\{\infty, -\infty\}$. The semantics
of the $\mu^{\#}$-calculus is defined with respect to a Kripke structure $M = (S, R, L)$ and an
assignment $f : X \to D^S$ to the variables. An assignment $f$ is legal if for all $x^{(c)} \in X$,
the measure function $f(x)$ is of arity $c$. Let $\mathcal{F}$ denote the set of all legal assignments. A
formula $\psi$ is interpreted as a function $\psi^M : \mathcal{F} \to D^S$. Thus, given a legal assignment
$f \in \mathcal{F}$, the formula $\psi$ defines a measure function for $M$ with respect to $f$. The function
$\psi^M$ is defined, for all $s \in S$, inductively as follows (when $M$ is clear from the context,
we omit it).

- $p(f)(s) = \infty$ if $p \in L(s)$ and $p(f)(s) = -\infty$ if $p \notin L(s)$.

- $\neg p(f)(s) = \infty$ if $p \notin L(s)$ and $\neg p(f)(s) = -\infty$ if $p \in L(s)$.

- For a free variable $x^{(c)}$, we have $x^{(c)}(f)(s) = f(x^{(c)})(s)$.

- $(\varphi \vee \psi)(f)(s) = \max\{\varphi(f)(s), \psi(f)(s)\}$.

- $(\varphi \wedge \psi)(f)(s) = \min\{\varphi(f)(s), \psi(f)(s)\}$.

- $(\Diamond\varphi)(f)(s) = \max\{\varphi(f)(s') \mid R(s, s')\}$.

- $(\Box\varphi)(f)(s) = \min\{\varphi(f)(s') \mid R(s, s')\}$.

- $\mathrm{set}\, x^{(c)}.\varphi(f)(s) = \mathrm{set}_c(\varphi(f)(s))$.

- $\mathrm{inc}\, x^{(c)}.\varphi(f)(s) = \mathrm{inc}_c(\varphi(f)(s))$.

Let $\lambda$ denote set or inc. As in the $\mu$-calculus, we assume that every variable $x^{(c)} \in X$
is bound at most once in a $\mu^{\#}$-calculus formula, and refer to the subformula that binds
$x^{(c)}$ as $\lambda(x)$. We can view a formula $\psi$ as a function $\psi : \mathcal{F} \to \mathcal{F}$. Indeed, given $f \in \mathcal{F}$,
all the subformulas of $\psi$, and in particular $\lambda(x)$, for all $x^{(c)} \in X$, are mapped into measure
functions. Formulas of the $\mu^{\#}$-calculus are monotone, in the sense that $\psi(f) \ge f$. Hence,
we can talk about the least fixpoint of a $\mu^{\#}$-calculus formula.

Let $g_{\psi} : X \to D^S$ be the result of applying $\psi$ on the assignment $g_0$ until a fixpoint
is reached, where $g_0$ assigns to every variable $x^{(c)}$ the measure function that maps all of $S$ to $-\infty$. Every
variable can be updated at most $|S| \cdot n_0 \cdot n_1 \cdot \ldots \cdot n_k$ times; thus the time complexity
is $O(|X| \cdot |S| \cdot n_0 \cdot n_1 \cdot \ldots \cdot n_k)$ and the space complexity is $O(|X| \cdot |S| \cdot (\log(n_0) +
\log(n_1) + \ldots + \log(n_k)))$.

Given a $\mu$-calculus formula $\psi$, we associate with $\psi$ a $\mu^{\#}$-calculus formula $\psi^{\#}$ that
characterizes the same set of states. We define $\psi^{\#}$ to be $\psi$ where the arity of a variable $x$
is w{x) = [ 2^^] , every p operator is replaced by a set operator, and every v operator 
is replaced by an inc operator. In order to check whether a Kripke structure M satisfies 
ip"^, we define the domain D where k = j-pj. gygj-y Q < z < fc we 

have $n_i = width(2 \cdot i + 1) \cdot |S|$.

Theorem 3. (Measured Collapse) Let $\psi$ be a $\mu$-calculus formula, and let $M$ be a
Kripke structure. Then, $M, s \models \psi$ iff $g_{\psi^{\#}}(\psi^{\#})(s) = \infty$.

The proof of Theorem 3 is described in the full version. Theorem 3 implies a simple
model-checking algorithm for the $\mu$-calculus. Given a $\mu$-calculus formula $\psi$ and a Kripke
structure $M$, translate $\psi$ into $\psi^{\#}$ and check whether $M \models \psi^{\#}$. The time complexity of
this algorithm is $O(|X| \cdot width(1) \cdot width(3) \cdot \ldots \cdot width(2 \cdot k + 1) \cdot |S|^{k+2})$ where $k$ is the
maximum alternation level of $\psi$. The space complexity is $O(|X| \cdot |S| \cdot (\log(width(1)) +
\log(width(2)) + \ldots + \log(width(k))))$. Note that in the model-checking algorithm that
uses a reduction to parity games, the time complexity is $O(|X| \cdot |al(1)| \cdot |al(3)| \cdot \ldots \cdot
|al(2 \cdot k + 1)| \cdot |S|^{k+1})$.

Recall that for all $i$, we have that $width(i) \le al(i)$. Thus, our complexity is better.
The improved complexity follows from the fact that the reduction of $\mu$-calculus model
checking to parity games does not take into account the fact that some variables with
the same alternation level may be independent of each other. On the other hand, the
translation to the $\mu^{\#}$-calculus refines the partition induced by the alternation level to the
relation $\approx$.

4 Symbolic $\mu^{\#}$-Calculus Model Checking and Parity Games

As discussed in Section 1, the improved algorithms for $\mu$-calculus model checking
are not symbolic. In this section we describe a symbolic algorithm for $\mu^{\#}$-calculus
model checking. The Measured Collapse Theorem then implies a symbolic algorithm
for $\mu$-calculus model checking, and our complexity matches the improved complexity of
[20]. In addition, we show how the algorithm in [20], for the equivalent problem of solving
parity games, can be viewed as a computation of a least fixed-point over a measured
domain, and describe a symbolic implementation for it that follows from this view. A
symbolic evaluation of $\mu$-calculus formulas uses Binary Decision Diagrams (BDDs) [5]
to represent characteristic functions [6]. For the $\mu^{\#}$-calculus, we use Algebraic Decision
Diagrams (ADDs), which extend BDDs by allowing arbitrary numerical domains [7].

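Symbolic evaluation of $\mu^{\#}$-calculus formulas. Consider a $\mu^{\#}$-calculus formula $\psi$ and
a Kripke structure $M = (S, R, L)$. We define the product of $\psi$ and $M$ as the graph
$G_{\psi,M} = (V, E)$, where

- $V = sub(\psi) \times S$.

- $E((\varphi, s), (\varphi', s'))$ iff one of the following holds.

  • $s = s'$ and there is $\varphi''$ such that $\varphi$ is $\varphi' \vee \varphi''$, $\varphi'' \vee \varphi'$, $\varphi' \wedge \varphi''$, or $\varphi'' \wedge \varphi'$.

  • $R(s, s')$ and $\varphi$ is $\Diamond\varphi'$ or $\Box\varphi'$.

  • $s = s'$ and $\varphi$ is $\mathrm{set}\, x^{(c)}.\varphi'$ or $\mathrm{inc}\, x^{(c)}.\varphi'$.

  • $s = s'$, $\varphi = x^{(c)}$ and $\varphi' = \lambda x^{(c)}.\varphi''$.

We refer to vertices of the form $(\varphi' \vee \varphi'', s)$ or $(\Diamond\varphi', s)$ as max vertices, and to vertices
of the form $(\varphi' \wedge \varphi'', s)$ or $(\Box\varphi', s)$ as min vertices.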

Let $g_{\psi} : sub(\psi) \to D^S$ be the least fixpoint of $\psi$. We describe the calculation of
$g_{\psi}$ by means of a function $f_{\psi,M} : V \to D$ such that $f_{\psi,M}(\varphi, s) = g_{\psi}(\varphi)(s)$. Note that for
all $\varphi \in sub(\psi)$, we have that $s \models \varphi$ iff $f_{\psi,M}(\varphi, s) = \infty$. In order to calculate $f_{\psi,M}$
we describe a sequence of functions $f_0, f_1, \ldots$ such that $f_{\psi,M} = f_i$ where $i$ is the least
index such that $f_i = f_{i+1}$. The functions $f_i : V \to D$ are defined inductively as follows. We
start with $f_0$.

- If $v = (p, s)$ then $f_0(v) = \infty$ if $s \models p$ and $f_0(v) = -\infty$ if $s \not\models p$.

- If $v = (\neg p, s)$ then $f_0(v) = \infty$ if $s \not\models p$ and $f_0(v) = -\infty$ if $s \models p$.

- For all other vertices $f_0(v) = -\infty$.

Given $f_i$, we define $f_{i+1}$ as follows.

- If $v$ is of the form $(p, s)$ or $(\neg p, s)$ then $f_{i+1}(v) = f_i(v)$.

- If $v$ is a max vertex, then $f_{i+1}(v) = \max\{f_i(v') \mid (v, v') \in E\}$.

- If $v$ is a min vertex, then $f_{i+1}(v) = \min\{f_i(v') \mid (v, v') \in E\}$.

- If $v$ is of the form $(x^{(c)}, s)$ then $v$ has a single successor $v'$ and $f_{i+1}(v) = f_i(v')$.

- If $v$ is of the form $(\mathrm{set}\, x^{(c)}.\varphi, s)$, then $v$ has a single successor $v'$ and $f_{i+1}(v) =
\mathrm{set}_c(f_i(v'))$.

- If $v$ is of the form $(\mathrm{inc}\, x^{(c)}.\varphi, s)$, then $v$ has a single successor $v'$ and $f_{i+1}(v) =
\mathrm{inc}_c(f_i(v'))$.
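
The iteration above can be sketched on an explicit graph as follows. This is a minimal, non-symbolic illustration under several assumptions that are not in the paper: vertices are arbitrary hashable names, `kind[v]` is a hypothetical `(tag, arity)` pair, a tuple of arity `c` has `c` components each ranging over `1..bounds[i]`, tuples passed to `set_c` and `inc_c` have at least `c` components, and every max or min vertex has at least one successor.

```python
INF, NEG_INF = "inf", "-inf"  # stand-ins for the two special elements of D

def key(d):
    """Sort key: -inf below every tuple, tuples lexicographic, inf on top."""
    if d == NEG_INF:
        return (0, ())
    if d == INF:
        return (2, ())
    return (1, tuple(d))

def set_c(d, c):
    """Keep the first c components; special elements are unchanged."""
    return d if d in (INF, NEG_INF) else tuple(d[:c])

def inc_c(d, c, bounds):
    """Smallest c-tuple strictly above d on its first c components, else inf."""
    if d == INF:
        return INF
    if d == NEG_INF:
        return (1,) * c
    digits = list(d[:c])
    for i in reversed(range(c)):  # mixed-radix increment, component i in 1..bounds[i]
        if digits[i] < bounds[i]:
            digits[i] += 1
            return tuple(digits)
        digits[i] = 1
    return INF

def step(f, succ, kind, bounds):
    """Compute f_{i+1} from f_i following the case analysis in the text."""
    g = {}
    for v, (tag, c) in kind.items():
        vals = [f[w] for w in succ.get(v, [])]
        if tag == "atom":
            g[v] = f[v]
        elif tag == "max":
            g[v] = max(vals, key=key)
        elif tag == "min":
            g[v] = min(vals, key=key)
        elif tag == "var":
            g[v] = vals[0]
        elif tag == "set":
            g[v] = set_c(vals[0], c)
        else:  # "inc"
            g[v] = inc_c(vals[0], c, bounds)
    return g

def evaluate(f0, succ, kind, bounds):
    """Iterate from f_0 until f_i = f_{i+1}."""
    f = dict(f0)
    while True:
        g = step(f, succ, kind, bounds)
        if g == f:
            return f
        f = g

# Hypothetical one-state product for "inc x.x": the inc vertex and the
# variable vertex point at each other.
kind = {"inc": ("inc", 1), "x": ("var", 0)}
succ = {"inc": ["x"], "x": ["inc"]}
result = evaluate({"inc": NEG_INF, "x": NEG_INF}, succ, kind, bounds=(2,))

# The same loop with a set operator never leaves -inf.
kind_set = {"set": ("set", 1), "x": ("var", 0)}
succ_set = {"set": ["x"], "x": ["set"]}
result_set = evaluate({"set": NEG_INF, "x": NEG_INF}, succ_set, kind_set, bounds=(2,))
```

In the first example the counter climbs through `(1,)` and `(2,)` and then overflows, so both vertices end at the top element; in the second nothing ever raises the value.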

Proposition 1. Consider a Kripke structure $M$ and a $\mu^{\#}$-calculus formula $\psi$. For all
$\varphi \in sub(\psi)$ and $s \in S$, we have $g_{\psi}(\varphi)(s) = f_{\psi,M}(\varphi, s)$.

We now describe how to compute $f_{\psi,M}$ symbolically. We use BDDs to represent sets
and relations, and use ADDs to represent measure functions. Consider a Kripke structure
$M = (S, R, L)$ and a formula $\psi$. Let $G_{\psi,M} = (V, E)$ be their product as defined above.

We assume that $M$ is given symbolically by one BDD $h_R$ for $R$, and $|AP|$ BDDs, one
BDD $h_p$ for each $p \in AP$, representing the set of states that satisfy $p$ (when the state
space is given by truth assignments to $AP$, there is no need for these BDDs). Given
these BDDs, constructing BDDs that represent $V$ and $E$ is straightforward. In particular,
we assume that $E$ is represented by the BDD $h_E$, and we also have the following BDDs
for subsets of $V$: a BDD $h_{AP}$ for vertices of the form $(p, s)$ or $(\neg p, s)$, BDDs $h_{max}$
and $h_{min}$ for the max and min vertices, respectively, a BDD $h_x$ for vertices of the form
$(x^{(c)}, s)$ for some $c$, BDDs $h_{set,j}$ for vertices of the form $(\mathrm{set}\, x^{(j)}.\varphi, s)$, and BDDs
$h_{inc,j}$ for vertices of the form $(\mathrm{inc}\, x^{(j)}.\varphi, s)$. Finally, the procedure also gets an integer
$c_{max}$, which is the maximal arity of a variable in $X$.

The algorithm for computing $f_{\psi,M}$ is described in Figure 1. Apart from the Boolean
BDD operators OR, AND, and NOT, we use the operator $\mapsto(h, d)$, which gets a BDD
$h \subseteq V$ and some $d \in D$, and creates an ADD that maps all the elements of $h$ to $d$, and
the following procedures.

- MAX, which given an ADD $f : V \to D$ and the BDD $h_E$, returns an ADD that
assigns to every vertex $v \in h_{max}$ the value $\max\{f(v') \mid E(v, v')\}$.

- MIN, which given an ADD $f : V \to D$ and the BDD $h_E$, returns an ADD that
assigns to every vertex $v \in h_{min}$ the value $\min\{f(v') \mid E(v, v')\}$.

- ASSIGN, which given an ADD $f : V \to D$ and the BDD $h_E$, returns an ADD that
assigns to every vertex $v \in h_x$ the value $f(v')$ for the single $v'$ with $E(v, v')$.

- SET($f$, $j$), which given an ADD $f : V \to D$, the BDD $h_E$, and $1 \le j \le c_{max}$,
returns an ADD that assigns to every vertex $v \in h_{set,j}$ the value $\mathrm{set}_j(f(v'))$ for
the single $v'$ with $E(v, v')$.

- INC($f$, $j$), which given an ADD $f : V \to D$, the BDD $h_E$, and $1 \le j \le c_{max}$,
returns an ADD that assigns to every vertex $v \in h_{inc,j}$ the value $\mathrm{inc}_j(f(v'))$ for
the single $v'$ with $E(v, v')$.

- OR between ADDs, which gets ADDs that map disjoint subsets of $V$ to $D$ and returns
their union (all the ADDs are defined for all the vertices in $V$, but some vertices are
mapped to some special value, which enables us to represent by ADDs also partial
functions).

Since all procedures assign values to the vertices according to their successors, it is
useful to generate, given an ADD $f$ and the BDD $h_E$, the ADD $f_{suc} : V \times V \to D$
such that $f_{suc}(v, v') = d$ if $E(v, v')$ and $f(v') = d$. If $\neg E(v, v')$, then $f_{suc}(v, v') = \infty$.
The ADD $f_{suc}$ is simply the result of an AND operation on $h_E$ and an ADD of $f$ with
renamed variables. Using $f_{suc}$, the implementation of ASSIGN is straightforward as
$\exists v'.(f_{suc}$ AND $h_x)$. The implementation of INC and SET is similar except that we replace
every leaf $d$ in the ADD of $(f_{suc}$ AND $h_{inc,j})$ or $(f_{suc}$ AND $h_{set,j})$ with $\mathrm{inc}_j(d)$
or $\mathrm{set}_j(d)$ respectively. The procedures MAX and MIN are more complicated and are
described in the full version.
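
The role of $f_{suc}$ can be illustrated with plain dictionaries standing in for ADDs. This is only an explicit-state sketch: no decision-diagram library is used, and the helper names `f_suc`, `restrict` and `map_leaves` as well as the two-vertex graph are hypothetical.

```python
INF = "inf"  # stand-in for the top element of D

def f_suc(f, edges, vertices):
    """Explicit table for f_suc: the successor's value on edges, INF elsewhere."""
    return {(v, w): (f[w] if (v, w) in edges else INF)
            for v in vertices for w in vertices}

def restrict(fs, edges, h):
    """Project away v' for the vertices in h, each of which has a single successor."""
    return {v: fs[(v, w)] for (v, w) in edges if v in h}

def map_leaves(partial, op):
    """Replace every value d of a partial function by op(d), as INC and SET do."""
    return {v: op(d) for v, d in partial.items()}

vertices = ["a", "b"]
edges = {("a", "b")}
f = {"a": "-inf", "b": (1, 2)}

fs = f_suc(f, edges, vertices)
assigned = restrict(fs, edges, {"a"})               # ASSIGN with h_x = {a}
truncated = map_leaves(assigned, lambda d: d[:1])   # SET keeping one component
```

In a symbolic implementation the leaf rewrite would touch each distinct terminal once instead of each vertex, which is where the ADD representation pays off.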

MODEL_CHECK

    h_∞ = (OR_{p ∈ AP} ({p} AND h_p)) OR (OR_{p ∈ AP} ({¬p} AND NOT(h_p)));
    f_0 = ↦(h_∞, ∞);
    f = f_0 OR ↦((h_V AND NOT h_∞), −∞);
    repeat
        f_old = f; f_max = MAX(f_old);
        f_min = MIN(f_old); f_x = ASSIGN(f_old);
        f_set = false; f_inc = false;
        for j = 1 to c_max do
            f_set = f_set OR SET(f_old, j); f_inc = f_inc OR INC(f_old, j);
        f = f_0 OR f_max OR f_min OR f_x OR f_set OR f_inc;
    until f = f_old

Fig. 1. The symbolic algorithm for $\mu^{\#}$-calculus model checking.



Let us now analyze the complexity of the procedure. The number of iterations
required for the procedure to reach a fixed point is bounded by $|D| \cdot |V|$, where
$|V| = |S| \cdot |\psi|$. Each iteration involves an application of the MIN/MAX procedures
(which are the most costly). In the full version, we show that these procedures apply
at most $|V|^2 \cdot \log(|V|) = (|S| \cdot |\psi|)^2 \cdot \log(|S| \cdot |\psi|)$ ADD operations. Thus, the overall
complexity is $O(|D| \cdot |S|^3 \cdot |\psi|^3 \cdot \log(|S| \cdot |\psi|))$ ADD operations.



Parity games. A parity game is played on a graph $(V, E)$, where $V$ is partitioned into
two sets: $V_0$ of even vertices and $V_1$ of odd vertices. Every vertex $v$ has a priority
$p(v) \in \{0, 1, \ldots, k - 1\}$. A parity game over $(V_0, V_1, E, p)$ is played by two players,
referred to as the odd and the even player. A play over the game starts by putting a pebble
at some initial vertex $v$ and proceeds for infinitely many rounds. In each round, one of the
players moves the pebble on an edge from the current vertex to one of its successors. If
the source vertex is in $V_0$, the even player moves the pebble; otherwise the odd player
moves the pebble. The play generates an infinite sequence of vertices $\rho$. Let $inf(\rho)$ be
the set of vertices that appear infinitely often in $\rho$. The odd player wins the game if the
vertex with minimal priority in $inf(\rho)$ has an odd priority. Otherwise, the even player
wins. The problem is to determine, given a game graph $(V_0, V_1, E, p)$, the set of vertices
from which the odd player has a winning strategy.

In [20], an algorithm for solving parity games is suggested. Below, we describe the 

k 

algorithm in terms of measure function. Let D = U|=i{0) • • ■ ) 0 {oo, — oo}, 

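let $F$ be the set of all measure functions $f : V \to D$ and let $f_0$ be the initial function
that assigns $-\infty$ to all vertices. A game graph $G$ induces a function from $F$ to $F$, where
for a measure function $f \in F$, the measure function $G(f)$ is defined, for all $v \in V$, as
follows:

$$G(f)(v) = \begin{cases}
\max_{(v,v') \in E} f(v') & \text{if } v \in V_1 \text{ and } p(v) \text{ is even,} \\
\max_{(v,v') \in E} \mathrm{inc}_{\lceil p(v)/2 \rceil}(f(v')) & \text{if } v \in V_1 \text{ and } p(v) \text{ is odd,} \\
\min_{(v,v') \in E} f(v') & \text{if } v \in V_0 \text{ and } p(v) \text{ is even,} \\
\min_{(v,v') \in E} \mathrm{inc}_{\lceil p(v)/2 \rceil}(f(v')) & \text{if } v \in V_0 \text{ and } p(v) \text{ is odd.}
\end{cases}$$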

If we denote by $f_G$ the least fixpoint of $G$, then the set of winning vertices for the
odd player is $\{v \mid f_G(v) = \infty\}$, and the set of winning vertices for the even player is
$\{v \mid f_G(v) < \infty\}$. The measure function $f_G$ can be used for generating a winning strategy
$\pi : V_0 \to V$ for the even player, where for every $v \in V_0$ we have $\pi(v) = v'$ such that
$f_G(v') = \min\{f_G(v'') \mid (v, v'') \in E\}$. Thus, the even player moves to a successor of $v$
with minimal measure. A symbolic procedure that generates a strategy is given in the full
version.
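
Reading off the winning sets and the even player's strategy from a given measure can be sketched as below. The measure values in the example are hand-picked for illustration rather than computed as a fixpoint, and the names `key`, `winning_sets` and `even_strategy` are hypothetical.

```python
INF, NEG_INF = "inf", "-inf"  # stand-ins for the two special elements of D

def key(d):
    """Sort key: -inf below every tuple, tuples lexicographic, inf on top."""
    if d == NEG_INF:
        return (0, ())
    if d == INF:
        return (2, ())
    return (1, tuple(d))

def winning_sets(f_G):
    """Odd wins exactly where the measure is inf; even wins everywhere else."""
    odd = {v for v, d in f_G.items() if d == INF}
    return odd, set(f_G) - odd

def even_strategy(f_G, succ, even_vertices):
    """At each even vertex, move to a successor with minimal measure."""
    return {v: min(succ[v], key=lambda w: key(f_G[w])) for v in even_vertices}

# Hand-picked measure on a hypothetical game with one even vertex "v".
f_G = {"v": (1,), "a": INF, "b": (1,), "c": NEG_INF}
succ = {"v": ["a", "b"]}

odd_wins, even_wins = winning_sets(f_G)
strategy = even_strategy(f_G, succ, ["v"])
```

Ties between successors are broken by list order here; any successor attaining the minimum would do.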

A symbolic implementation of the algorithm, similar to the symbolic evaluation
of $\mu^{\#}$-calculus formulas, is described in Figure 2. The procedure calls the following
procedures.

- MAXe, which given an ADD $f : V \to D$, the BDD $h_E$, and an even $1 \le j \le k$,
returns an ADD that assigns to every vertex $v \in V_1$ with $p(v) = j$ the value
$\max\{f(v') \mid E(v, v')\}$.

- MAXo, which given an ADD $f : V \to D$, the BDD $h_E$, and an odd $1 \le j \le k$,
returns an ADD that assigns to every vertex $v \in V_1$ with $p(v) = j$ the value
$\max\{\mathrm{inc}_{\lceil j/2 \rceil}(f(v')) : E(v, v')\}$.

- MINe and MINo, defined similarly for vertices in $V_0$.

The symbolic implementation of these procedures is similar to the implementation of
the MAX and MIN procedures of the former section, and is described in the next section.

PARITY(G)

    f = ↦(V, −∞);
    repeat
        f_old = f; f = false;
        for j = 1 to k do
            if j is even then f = f OR MAXe(f_old, j) OR MINe(f_old, j);
            if j is odd then f = f OR MAXo(f_old, j) OR MINo(f_old, j);
        end for
    until f = f_old;

Fig. 2. A symbolic algorithm for solving parity games.



Complexity: Similarly to the previous section, we can bound the number of iterations
by $|D| \cdot |G|$. Thus, the overall complexity is $O(|D| \cdot |G|^3 \cdot \log(|G|))$ ADD operations.



References 

1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995.

2. J. F. A. K. van Benthem. Modal Logic and Classical Logic. Bibliopolis, Naples, 1985. 

3. G. Bhat and R. Cleaveland. Efficient local model-checking for fragments of the modal
$\mu$-calculus. In Proc. TACAS, LNCS 1055, 1996.

4. J.C. Bradfield. The modal $\mu$-calculus alternation hierarchy is strict. TCS, 195(2):133-153,
1998.




5. R.E. Bryant. Graph-based algorithms for boolean-function manipulation. IEEE Trans, on 
Computers, C-35(8), 1986. 

6. J.R. Burch, E.M. Clarke, K.L. McMillan, D.L. Dill, and L.J. Hwang. Symbolic model
checking: $10^{20}$ states and beyond. I&C, 98(2):142-170, 1992.

7. R. Bahar, E. Frohm, C. Gaona, G. Hachtel, E. Macii, A. Pardo, and F. Somenzi. Algebraic
decision diagrams and their applications. FMSD, 10(2/3):171-206, 1997.

8. A. Chakrabarti, L. de Alfaro, T.A. Henzinger, M. Jurdzinski, and F.Y.C. Mang. Interface
compatibility checking for software modules. In 14th CAV, LNCS 2404, pp. 428-441, 2002.

9. H.D. Ebbinghaus and J. Flum. Finite Model Theory. Perspectives in Mathematical Logic.
Springer-Verlag, 1995.

10. E.A. Emerson. Model checking and the $\mu$-calculus. In Descriptive Complexity and Finite
Models, American Mathematical Society, pp. 185-214, 1997.

11. E.A. Emerson and E.M. Clarke. Characterizing correctness properties of parallel programs
using fixpoints. In Proc. 7th ICALP, pp. 169-181, 1980.

12. E.A. Emerson, C. Jutla, and A.P. Sistla. On model-checking for fragments of $\mu$-calculus. In
Proc. 5th CAV, LNCS 697, pp. 385-396, 1993.

13. E.A. Emerson and C.-L. Lei. Efficient model checking in fragments of the propositional
$\mu$-calculus. In Proc. 1st LICS, pp. 267-278, 1986.

14. K. Etessami, Th. Wilke, and R. A. Schuller. Fair simulation relations, parity games, and state
space reduction for Büchi automata. In Proc. 28th ICALP, LNCS 2076, pp. 694-707, 2001.

15. Y. Gurevich and S. Shelah. Fixed-point extensions of first-order logic. Annals of Pure and 
Applied Logic, 32:265-280, 1986. 

16. T.A. Henzinger, O. Kupferman, and S. Rajamani. Fair simulation. I&C, 173(1):64-81, 2002. 

17. N. Immerman. Relational queries computable in polynomial time. I&C, 68:86-104, 1986.

18. M. Jurdzinski and J. Vöge. A discrete strategy improvement algorithm for solving parity games.
In E. A. Emerson and A. P. Sistla, editors, Proc. 12th CAV, LNCS 1855, pp. 202-215, 2000.

19. M. Jurdzinski. Deciding the winner in parity games is in UP $\cap$ co-UP. IPL, 68(3):119-124,
1998.

20. M. Jurdzinski. Small progress measures for solving parity games. In Proc. 17th STACS,
LNCS 1770, pp. 290-301, 2000.

21. N. Klarlund. Progress measures for complementation of $\omega$-automata with applications to
temporal logic. In Proc. 32nd FOCS, pp. 358-367, 1991.

22. D. Kozen. Results on the propositional $\mu$-calculus. TCS, 27:333-354, 1983.

23. O. Kupferman and M.Y. Vardi. Weak alternating automata and tree automata emptiness. In
Proc. 30th STOC, pp. 224-233, 1998.

24. O. Kupferman and M.Y. Vardi. Weak alternating automata are not that weak. ACM ToCL,
2001(2):408-429, 2001.

25. O. Kupferman, M.Y. Vardi, and P. Wolper. An automata-theoretic approach to branching-time
model checking. J. ACM, 47(2):312-360, 2000.

26. D. Leivant. Inductive definitions over finite structures. I&C, 89:95-108, 1990. 

27. D. Long, A. Brown, E. Clarke, S. Jha, and W. Marrero. An improved algorithm for the 
evaluation of fixpoint expressions. In Proc. 6th CAV, LNCS 818, pp. 338-350, 1994. 

28. K.L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, 1993. 

29. Y. N. Moschovakis. Elementary Induction on Abstract Structures. North Holland, 1974. 

30. N. Klarlund and F.B. Schneider. Proving nondeterministically specified safety properties using
progress measures. I&C, 107(1):151-170, 1993.

31. D. Park. Finiteness is $\mu$-ineffable. TCS, 3:173-181, 1976.

32. V.R. Pratt. A decidable $\mu$-calculus: preliminary report. In 22nd FOCS, pp. 421-427, 1981.

33. H. Seidl. Fast and simple nested fixpoints. IPL, 59(6):303-308, 1996. 




An Information Theoretic Lower Bound for 
Broadcasting in Radio Networks 



Carlos Brito, Eli Gafni, and Shailesh Vaya 

Department of Computer Science 
University of California Los Angeles 
Los Angeles - 90049 
{fisch, eli, vaya}@cs.ucla.edu



Abstract. We consider the problem of deterministic broadcasting in
undirected radio networks with limited topological information. We show
that for every deterministic protocol there exists a radius 2 network which
requires at least $\Omega(n^{1/2})$ rounds for completing broadcast. The previous
best lower bound for constant diameter networks is $\Omega(n^{1/4})$ rounds, due
to [23]. For networks of radius $D$ the lower bound can be extended to
$\Omega((nD)^{1/2})$ rounds. This resolves the open problem posed by [23].

Of perhaps more interest is our approach for proving the lower bound,
which is novel. We quantify the amount of connectivity information
about the topology of the network that the source can learn in an arbitrary
number of rounds of a deterministic broadcasting protocol. This approach
is much more intuitive and exposes the structure of the broadcasting
problem. We believe it is of independent interest and may have
other applications.



1 Introduction 

Ad-hoc wireless networks have been the subject of extensive research in recent 
years. These networks find potential applications in scenarios such as battlefields, 
emergency disaster relief, and situations in which it is very difficult to provide 
the necessary infrastructure. Techniques and algorithms developed for radio net- 
works also find applications in the ever growing field of wireless computing. 

Communication in ad-hoc wireless networks is structured using synchronous 
time-slots. A global clock, which indicates the current round number, is provided 
to all the nodes in the network. At every round each node acts either as a trans- 
mitter or a receiver. A radio network can be modeled as an undirected connected 
graph as follows. Each node in the graph represents a processor, and two nodes 
are connected by an edge if the corresponding processors lie within the transmission
range of each other. A message transmitted by a node can potentially reach
all its neighbors. However, if more than one neighbor sends a message in the same 
round, then a collision occurs and the node does not receive anything. A node 
cannot distinguish between a collision and silence, that is, a node cannot decide 
if none or more than one of its neighbors have transmitted in a given round. We 
will see in the next sections that these restrictions on the communication have a
severe impact on the design and performance of algorithms for radio networks. 
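
The reception rule described above can be sketched in a few lines of Python. The function name `receive_round`, the adjacency-list representation, and the five-node example network are hypothetical choices for illustration.

```python
def receive_round(adj, transmitting):
    """Return {node: sender} for the nodes that hear a message in this round."""
    heard = {}
    for node, neighbors in adj.items():
        if node in transmitting:
            continue  # a node acting as transmitter does not receive
        senders = [u for u in neighbors if u in transmitting]
        if len(senders) == 1:  # zero senders is silence, two or more is a collision
            heard[node] = senders[0]
    return heard

# Hypothetical network: source 0, middle nodes 1, 2, 3, and node 4 adjacent to 1 and 2.
adj = {0: [1, 2, 3], 1: [0, 4], 2: [0, 4], 3: [0], 4: [1, 2]}

one_sender = receive_round(adj, {1})
collision = receive_round(adj, {1, 2})
mixed = receive_round(adj, {1, 3})
```

Silence and collision both leave a node out of the result, which mirrors the fact that a node cannot tell the two apart.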

This paper is about broadcasting in radio networks. Broadcasting is one of
the most studied problems in radio networks. It consists of a task initiated by a
single processor, called the source, whose goal is to send a single message to all the
processors in the network. The complexity of a broadcasting protocol is measured
by the number of rounds it requires to deliver the broadcast message to all the
nodes in the network. The following section reviews some of the important results
for this problem.



1.1 Related Work 

The study of broadcasting in radio networks whose nodes have only limited
knowledge of the topology was initiated by the seminal work of Bar-Yehuda,
Goldreich, and Itai [3]. In the framework considered by them, nodes know only their
own label and the labels of their neighbors. Under this assumption, a simple
linear time broadcasting algorithm exists based on distributed depth-first search
[2]. Reference [3] also claims a lower bound of $\Omega(n)$ for constant diameter
networks. However, as observed by [23], this result does not hold for the stated
model, but seems to hold for a related, more restricted model [26].

For the original model, Kowalski, Pelc [23] established a lower bound of 
for networks of diameter 4 (and, for diameter D). 

A weaker setting, where nodes know their own labels, but do not know the 
labels of their neighbors, has also been studied. Research in this problem has 
led to the introduction of an interesting combinatorial concept called selective 
family. The use of selective families in the design of deterministic protocols for 
unknown networks was introduced by Chlebus et al. in [9]. Several recent works
exploit this combinatorial tool, specifically the use of the probabilistic method, for
obtaining good lower and upper bounds for the broadcasting problem [12,9,11]. 

2 Discussion of Results and Techniques 

The main result of this work is a lower bound on the number of rounds required 
by any deterministic protocol to broadcast a message in a radio network, where 
nodes have limited knowledge about the topology and the underlying graph is 
undirected. The result is formally stated in the next theorem, and holds for the 
same model studied in [3], [23]. 

Theorem 1. Every deterministic protocol requires at least $\Omega(\sqrt{n})$ rounds in the
worst case to complete broadcast in a radio network.

Besides the lower bound provided in Theorem 1, which considerably improves
on previous results, another important contribution of the paper is a set of 
novel techniques that could be useful in the analysis of related problems. In the 
following we discuss some subtle aspects of the problem of broadcasting in radio 
networks, that motivate our ideas. 

Consider the simple class of networks illustrated in Figure 1, which can be
partitioned into 3 layers: (a) layer $L_0$ consisting only of the source node; (b)
layer $L_1$ composed of $n$ nodes; and (c) layer $L_2$ consisting only of a single node,
say $v$. Communication can only occur between the source and each node in layer
$L_1$, and between an arbitrary set of nodes in $L_1$ and the node $v$ in $L_2$. Following
[23], we call networks with such topologies BGI networks.



Fig. 1. Example of a BGI network (layers $L_0$, $L_1$, $L_2$; the source is in $L_0$)



Assuming that the source transmits the broadcast message in round 0 and
remains silent thereafter, one can use a hitting set argument [3] to show that
any deterministic protocol requires $\Omega(n)$ rounds to complete broadcast in at
least one such network.

However, as observed in [23], the source can learn some information about
the topology during the execution of the protocol. This information can then be
transmitted to the other nodes in the network to achieve faster broadcasting. In
fact, [23] provides a protocol that completes broadcast in $O(\lg n)$ rounds on any BGI
network.

Our main idea is to estimate the amount of information that can be learned
by the source. Then, we show that even if all this information were available to
the neighbors of the source at the beginning of the execution, and the source
remains silent, broadcast still cannot be achieved in fewer than $\sqrt{n}$ rounds.

For this purpose, we consider a class of networks that can be partitioned
into components $C_1, \ldots, C_k$, where each of the components has the structure of
a BGI network without the source, which is the same for all components (see
Figure 2 for an example).

The first important observation is that all the information learned by the 
source during the execution of a protocol can be summarized by topological 
information about the components. More specifically, in each round i, if the 
source receives a message from a node $u$, then we may assume that the message
contains a complete description of the topology of the component of $u$. In the
other case, either no neighbor of the source transmits or a collision occurs during
round $i$, and the source learns nothing in the round, which we denote by $\phi$. These
ideas are formalized in Section 5.

From this observation, it is tempting to conclude that the partial knowledge 
collected by the source makes broadcasting easy in some components, but is 
uninformative with respect to the rest of the network. If this were the case, then 
one could use a hitting set [3] or a selective family [9] argument to obtain lower 
bounds for completing broadcast. 

However, as we show in the next example, depending on the specification of 
the protocol, the information provided by the source may allow the nodes to 
obtain additional information about the topology of the network. 

Consider a protocol according to which exactly two nodes, say $u$ and $v$, are
scheduled to make a transmission to the source in a specific round, but only if
each of them is connected to some other node, say $w$ and $z$, respectively. Now,
if the source receives $\phi$ in this round it cannot distinguish the case where both
$u$ and $v$ transmit from the case in which neither of them transmits. On the other
hand, with the information that the source receives $\phi$ in this round, node $u$ can
decide if edge $(v, z)$ is present or not based on its own behavior during the round.

To avoid this type of problem we provide an information-theoretic argument
that is independent of how the information provided by the source is interpreted
by the other nodes. Specifically, we first obtain an upper bound on the number of
bits required to encode all the information learned by the source in the first $\sqrt{n}$
rounds of an arbitrary protocol. Then, we show that even if all the neighbors
of the source receive the same arbitrary string of at most that size, and the
source remains quiet in every round, $\sqrt{n}$ rounds are still necessary to complete
broadcast.

The rest of the paper proves these facts, and is organized as follows. Section 
3 describes the adopted model in more detail and introduces some definitions. 
Section 4 formally states the facts above in a few lemmas, and gives the proof 
of our main result. Sections 5, ?? and Appendix ?? provide proofs for all the 
intermediate results. Finally, we conclude in Section 6.



3 Description of the Model and General Definitions 

Definition [3], [23]: A broadcast protocol $\pi$ for a radio network is a multiprocessor
protocol which proceeds in rounds as follows:

1. Nodes have distinct labels from the set {0,1,..., m}, where m is a polynomial 
in the number of nodes in the network. A distinguished node with label 0 is 
called the source. 

2. All nodes execute identical copies of the same protocol $\pi$.

3. In each round, every node acts either as a transmitter or as a receiver (or is
inactive).







4. A node receives a message in a specific round if and only if it acts as a receiver 
and exactly one of its neighbors transmits in that round. We assume that 
the messages are authenticated, that is, when a node receives a message it 
knows the label of the transmitting node. 

5. The action of a node in a specific round is determined by 

a) Initial input, which contains its own label and labels of its neighbors. 

b) Messages received in previous rounds. 

6. In round 0 only the source transmits a broadcast message. 

7. Only nodes that have received a message are allowed to transmit. That is,
the only "spontaneous" transmission is the one by the source in round 0.

8. Broadcast is completed in r rounds if all nodes receive the source message
in one of the rounds 0, 1, ..., r - 1.
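
The reception rule of item 4 can be illustrated with a minimal sketch of a single round. The function name `radio_round`, the adjacency-dictionary representation, and the simplification that every non-transmitting node acts as a receiver (inactive nodes are ignored) are assumptions made for illustration; they are not part of the model definition.

```python
def radio_round(adj, transmissions):
    """Simulate one round of a radio network (illustrative sketch).

    adj: dict mapping each node label to the set of its neighbors.
    transmissions: dict mapping each transmitting node to its message.
    Returns a dict mapping each receiving node to (sender, message).
    Assumption: every node that does not transmit acts as a receiver.
    """
    received = {}
    for node, neighbors in adj.items():
        if node in transmissions:
            continue  # a transmitter cannot receive in the same round
        senders = [v for v in neighbors if v in transmissions]
        if len(senders) == 1:
            # exactly one transmitting neighbor: the message gets through,
            # and the receiver learns the sender's label (authentication)
            sender = senders[0]
            received[node] = (sender, transmissions[sender])
        # zero senders (silence) or two or more (collision): nothing is heard
    return received
```

For example, if two neighbors of the source transmit in the same round, the source hears nothing, exactly as if both had stayed silent.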

3.1 $C_2$ Networks

The class of networks $C_2$ is characterized by the fact that nodes can be partitioned
into layers satisfying:

1. Layer $L_0$ consists only of the source node.

2. Layer $L_1$ contains $cn$ nodes, where c is a constant with respect to n. Each
node in layer $L_1$ is connected only to the source and to at most one node from
layer $L_2$.

3. Layer $L_2$ contains $d\sqrt{n}$ nodes, where d is a constant with respect to n and
d < c. Each node in layer $L_2$ is connected to exactly $\sqrt{n}$ nodes in $L_1$, and it
is not connected to the source.

For each node u from $L_2$ in network $\mathcal{N}$, we denote the set of all
the nodes from $L_1$ connected to u by $Clan_{\mathcal{N}}(u)$, and say that u is the head of
this clan. A node in $L_1$ that is not connected to any node in $L_2$ is said to be an
orphan. Figure 2 shows an example of a $C_2$ network.
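
As a concrete illustration of the layer structure, the following sketch builds one particular network with this shape. The function name `build_c2_network`, the consecutive labeling, the restriction to integer constants c and d, and the requirement that n be a perfect square are assumptions chosen to keep the example small.

```python
import math


def build_c2_network(n, c, d):
    """Build one network with the layered shape described above (sketch).

    Assumptions: n is a perfect square, c and d are integers with 0 < d < c,
    labels are consecutive, and clans are consecutive blocks of L1.
    Returns (adj, l1, l2) where adj maps each node to its neighbor set.
    """
    s = math.isqrt(n)
    if s * s != n:
        raise ValueError("n must be a perfect square in this sketch")
    if not 0 < d < c:
        raise ValueError("need 0 < d < c")
    source = 0
    l1 = list(range(1, c * n + 1))                   # c*n nodes
    l2 = list(range(c * n + 1, c * n + 1 + d * s))   # d*sqrt(n) head nodes
    adj = {v: set() for v in [source] + l1 + l2}
    for v in l1:                                     # every L1 node sees the source
        adj[source].add(v)
        adj[v].add(source)
    for k, head in enumerate(l2):                    # each head gets sqrt(n) clan members
        for v in l1[k * s:(k + 1) * s]:
            adj[head].add(v)
            adj[v].add(head)
    return adj, l1, l2
```

With n = 4, c = 2, d = 1 this yields 8 nodes in the first layer, 2 heads with clans of size 2, and 4 orphans.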



3.2 Advice String $\Upsilon = \Upsilon(\pi, \mathcal{N}, t)$

For a broadcast protocol $\pi$, a network $\mathcal{N} \in C_2$, and an integer t > 0, $\Upsilon = \Upsilon(\pi, \mathcal{N}, t)$
is a string that encodes the topological knowledge (potentially) acquired by the
source during the first t rounds of execution of protocol $\pi$ on network $\mathcal{N}$. Viewing
$\Upsilon$ as an array with t elements, $\Upsilon[i]$ represents the information obtained by the
source during round i, $0 \le i < t$, and is given by:

1. If the source does not receive any message during round i (either because
every node in $L_1$ was silent, or because a collision occurred), then $\Upsilon[i] = \phi$.

2. In the other case, let w be the node in $L_1$ that transmits the message received
by the source. Then, $\Upsilon[i]$ contains a pair consisting of: (a) a list with the
labels of all nodes in the same clan as w; and (b) the label of the clan of w.

Fig. 2. Example of a $C_2$ network



Note that when $\Upsilon[i] \neq \phi$, it encodes all the information about the connectivity
among the nodes in the clan of w.
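
The two cases defining one entry of the advice string can be sketched as follows. The names `advice_entry`, `clan_of` and `head_of` are hypothetical, `None` stands in for the "nothing learned" symbol, the "label of the clan" is interpreted as the label of its head, and an orphan is treated as a clan of size one with no head; all of these are assumptions of the sketch.

```python
PHI = None  # stands for the "nothing learned" symbol


def advice_entry(l1_transmitters, clan_of, head_of):
    """Compute one advice-string entry for a single round (sketch).

    l1_transmitters: set of L1 nodes transmitting in the round.
    clan_of: dict mapping an L1 node to the set of nodes in its clan
             (assumption: an orphan maps to a singleton set).
    head_of: dict mapping an L1 node to the label of its head
             (assumption: None for an orphan).
    """
    if len(l1_transmitters) != 1:
        return PHI  # silence or collision: the source learns nothing
    (w,) = l1_transmitters
    return (sorted(clan_of[w]), head_of[w])
```

Note that silence and collision produce the same entry, which is exactly why the source cannot tell them apart.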

Let $K_A(\Upsilon)$ denote the Kolmogorov complexity of string $\Upsilon$ with respect to a
decompression algorithm A. Define

$$K_C(t) = \min_{A} \max_{\pi, \mathcal{N}} K_A(\Upsilon(\pi, \mathcal{N}, t)).$$

4 Lower Bound 

The lower bound result stated in section 2 is implied by the following theorem: 

Theorem 2. For every deterministic protocol that runs for at most $\sqrt{n}$ rounds,
there exists a network $\mathcal{N} \in C_2$ such that some head node in $\mathcal{N}$ does not receive
the source message, and so broadcast is not completed.

Before proving the theorem, we state two important lemmas developed in
this work and a variation of a theorem from [12], which together constitute the main
steps of the argument.

The first lemma shows that it is sufficient to analyze protocols that have
a simple communication structure, and in which the source only transmits the
broadcast message together with an advice string $\Upsilon$ of bounded size, in round
0. This lemma captures the idea that all the help that could be provided by
the source to achieve faster broadcast can be encoded in a string containing
connectivity information.

Lemma 1. If there exists a protocol $\pi$ which completes broadcast in at most
r rounds on a network $\mathcal{N} \in C_2$, then there exists a protocol $\pi'$ that completes
broadcast in at most 3r rounds on $\mathcal{N}$, satisfying:

- For i = 0, 1, 2, nodes in layer $L_i$ transmit in round t only if $t \equiv i \pmod{3}$.

- In round 0, the source transmits a message containing the broadcast message
and a string of length at most [...]. In every other round the source
remains silent.

The proof of Lemma 1 is presented in Section 5, and consists of the construction
of protocol $\pi'$ through a series of reductions.

The next lemma provides a lower bound on the size of a set of networks
with some special properties. The lemma is proved in section ?? by a counting
argument, using estimates on the number of networks that share the same advice
string $\Upsilon$, and the number of distinct clans that appear in at least one of these
networks.

Lemma 2. Let $\pi$ be an arbitrary protocol and t > 0. Then, there exists a subset
of networks $X \subseteq C_2$ and a head node $v \in L_2$ such that:

1. For all $\mathcal{N}_i, \mathcal{N}_j \in X$, $\Upsilon(\pi, \mathcal{N}_i, \sqrt{n}) = \Upsilon(\pi, \mathcal{N}_j, \sqrt{n})$.

2. The clans with which v is associated in each network in X are distinct.

3. $\lg_2 |X| > \frac{13}{12}\sqrt{n}\,\lg_2(\sqrt{n} + 1)$.

Finally, we will also make use of the next definition and theorem in our 
proof. They generalize the concept of selective families and the corresponding 
lower bound given in [12], for the case in which we consider arbitrary sets $\Delta$ of
subsets of $\{1, \ldots, n\}$. The proof of this theorem is given in Appendix A.

Definition 1. Let n, k be arbitrary positive integers with $k \le n$, and let $\Delta$ be a
set of subsets of $\{1, \ldots, n\}$. A family $\mathcal{F}$ of subsets of $\{1, \ldots, n\}$ is k-selective for
$\Delta$ if for every $\delta \in \Delta$ with $|\delta| \le k$, there is a subset $F \in \mathcal{F}$ such that $|\delta \cap F| = 1$.
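
The selectivity condition of Definition 1 can be checked directly on small examples. The sketch below is a brute-force test; the function name `is_k_selective` is hypothetical, and the size condition is read as "at most k", which is an assumption about the garbled inequality sign.

```python
def is_k_selective(family, targets, k):
    """Check whether `family` is k-selective for `targets` (sketch).

    family: iterable of sets (the candidate selective family).
    targets: iterable of sets (the collection to be selected from).
    Every target of size at most k must intersect some member of the
    family in exactly one element.
    """
    family = [set(f) for f in family]
    return all(
        any(len(set(t) & f) == 1 for f in family)
        for t in targets
        if len(t) <= k
    )
```

In the broadcast setting, a member of the family corresponds to the set of nodes scheduled to transmit in one round, and a target corresponds to a clan: an intersection of size one means that exactly one clan member transmits, so the head receives the message.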

Theorem 3. Let $\Delta$ be a set of $\sqrt{n}$-subsets of $\{1, \ldots, n\}$. If $\mathcal{F}$ is a $\sqrt{n}$-selective
family for $\Delta$, then

$$|\mathcal{F}| > \lg_2 |\Delta| - \frac{11}{12}\sqrt{n}\,\lg_2 \frac{16n}{\sqrt{n}}.$$

Proof of Theorem 2: The proof is by contradiction. We shall assume
that the broadcast protocol runs for $t \le \sqrt{n}$ rounds. However, we shall show
that t must actually be greater than $\sqrt{n}$ for broadcast to be completed
in all networks.

Suppose that there exists a protocol $\pi$ that completes broadcast in every
network from $C_2$ in $t \le \sqrt{n}$ rounds. Then, let $\pi'$ be a protocol satisfying the
conditions of Lemma 1 that completes broadcast in every such network in at
most $3\sqrt{n}$ rounds.

Now, let $X \subseteq C_2$ and $u \in L_2$ be a subset of networks and the corresponding
head node given by Lemma 2 for protocol $\pi$.

For a given network $\mathcal{N} \in X$ and a round i, let the subset $\eta_{i,\mathcal{N}}(u)$ be defined as
follows:

- If node u does not receive the broadcast message before round i when $\pi$
is executed on $\mathcal{N}$, then $\eta_{i,\mathcal{N}}(u)$ is the subset of nodes from $Clan_{\mathcal{N}}(u)$ that
would transmit in round i;

- Otherwise, $\eta_{i,\mathcal{N}}(u) = \phi$.

Let $\eta_i(u) = \bigcup_{\mathcal{N} \in X} \eta_{i,\mathcal{N}}(u)$.

Proposition 1. Node u receives the broadcast message up to round 3t in network
$\mathcal{N} \in X$ when protocol $\pi'$ is run on $\mathcal{N}$ only if

$$|Clan_{\mathcal{N}}(u) \cap \eta_{3j+1}(u)| = 1, \text{ for some } j < t.$$

The proof of this proposition follows easily by induction, from the observation that
the behavior of a node $v \in Clan_{\mathcal{N}}(u)$ is independent of the topology of $\mathcal{N}$ up to
the first transmission of u.

Since $\pi'$ completes broadcasting in $3\sqrt{n}$ rounds, it follows that for all $\mathcal{N} \in X$
there exists a $j < \sqrt{n}$ such that $|Clan_{\mathcal{N}}(u) \cap \eta_{3j+1}(u)| = 1$. But this implies that
the set $\Gamma = \{\eta_{3j+1}(u) : j = 0, \ldots, \sqrt{n}\}$ (where $|\Gamma| = 3\sqrt{n}$) is a $\sqrt{n}$-selective
family for the set $\Delta = \{Clan_{\mathcal{N}}(u) : \mathcal{N} \in X\}$.

Theorem 3, however, requires that the following relation hold between $\Gamma$
and $\Delta$:

$$|\Gamma| > \lg_2 |\Delta| - \frac{11}{12}\sqrt{n}\,\lg_2 \frac{16n}{\sqrt{n}} \qquad (1)$$

By Lemma 2, each network $\mathcal{N} \in X$ is associated with a
distinct clan, so $|\Delta| = |X|$. And by clause (3) of Lemma 2 we have
$\lg_2 |X| > \frac{13}{12}\sqrt{n}\,\lg_2(\sqrt{n} + 1)$.

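Plugging these values into equation (1), we have

$$|\Gamma| > \lg_2 |\Delta| - \frac{11}{12}\sqrt{n}\,\lg_2 \frac{16n}{\sqrt{n}} > \frac{13}{12}\sqrt{n}\,\lg_2(\sqrt{n} + 1) - \frac{11}{12}\sqrt{n}\,\lg_2 \frac{16n}{\sqrt{n}} > 3\sqrt{n}$$

for large values of n.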

Contradiction, as $|\Gamma| \le 3\sqrt{n}$. Hence, there exists no deterministic protocol
that completes broadcast in at most $\sqrt{n}$ rounds in every $C_2$ network. □
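
The phrase "for large values of n" in the proof can be made tangible with a small numeric check. The sketch below takes the constants in equation (1) and in clause (3) of Lemma 2 to be 11/12 and 13/12 respectively, which is an assumption about the extracted formulas, and the function names are hypothetical.

```python
import math


def family_size_lower_bound(n):
    """Evaluate the assumed right-hand side of the chain of inequalities:
    (13/12) sqrt(n) lg(sqrt(n) + 1) - (11/12) sqrt(n) lg(16 n / sqrt(n))."""
    r = math.sqrt(n)
    return (13 / 12) * r * math.log2(r + 1) - (11 / 12) * r * math.log2(16 * n / r)


def contradicts_three_sqrt_n(n):
    """True when the assumed bound exceeds 3 sqrt(n), the number of rounds
    available to the simulated protocol."""
    return family_size_lower_bound(n) > 3 * math.sqrt(n)
```

Under these assumed constants the gap grows like (1/12) sqrt(n) lg n, so the bound overtakes 3 sqrt(n) only once lg n is roughly 80 or more; the contradiction is therefore asymptotic rather than one that appears for small networks.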

Note: The combinatorial argument in Lemma 2 can be circumvented by
an oracle argument as follows. The advice string provided to nodes in $L_1$ is
composed of $\sqrt{n}$ blocks. Each of these blocks is either $\phi$ or the connectivity
information of a clan. Instead of treating $\phi$ as one bit of information, one can use
an oracle argument and substitute $\phi$ with the connectivity information of two
clans, chosen arbitrarily from the set of clans whose member nodes transmitted
in the corresponding round. One may now argue that if the advice string has
just $o(\sqrt{n})$ blocks then a large number of clans get no help from the source. A
selective family or hitting set argument can now be used to prove that $\Omega(\sqrt{n})$
rounds are still necessary for completion of broadcasting in the residual network.
We note here a weakness of the oracle-based argument: in a more general model,
where a node receives $\phi_i$ when i of its neighbors transmit in the same round
(so that $\phi_2$ is received when two nodes transmit), the substitution of $\phi_i$ by the
connectivity information of i clans, whose members transmitted in that round,
provides too much information to the network to argue the lower bound.



5 Reductions (Proof of Lemma 1) 

Fix a network $\mathcal{N} \in C_2$. First, we introduce some notation. Let $\pi$ be a deterministic
broadcast protocol, let w be a node in the network, and let t > 0. Then, we
define

- $H(\pi, w, t)$ maintains the communication history of node w up to round t,
in the form of a list of pairs corresponding to the messages transmitted and
received by w in each round.

Note that, from the model description in Section 3, at most one of the elements
in each pair of $H(\pi, w, t)$ can be different from $\phi$. Moreover, the behavior
of node w during round t + 1 is completely determined by its inputs and the
contents of $H(\pi, w, t)$.



5.1 Reduction 1 

This reduction shows that we may consider protocols with a simplified com- 
munication structure, at the cost of a constant factor increase in the number 
of rounds to complete broadcast. In these protocols, the nodes coordinate their 
transmissions such that collisions involving nodes from different layers do not 
occur. 

Claim. Assume there exists a protocol $\pi$ that completes broadcast in at most r
rounds. Then, there exists a protocol $\pi'$ that completes broadcast in at most 3r
rounds, such that, for i = 0, 1, 2, nodes in layer $L_i$ transmit in round t only if
$t \equiv i \pmod{3}$.

Proof. Protocol $\pi'$ simulates each round t of protocol $\pi$ with a sequence of 3
rounds: 3t, 3t + 1, 3t + 2. The idea is that, in round 3t + i, each node in layer $L_i$
takes the same action that it would take in round t under protocol $\pi$.

Assuming that $\pi'$ has a description of protocol $\pi$, it is sufficient to show that
each node w can compute the contents of $H(\pi, w, t)$ for all $0 \le t \le r$. This can
be proved by induction as follows.

Suppose that at the beginning of round t node w has the correct history of
the communication under protocol $\pi$ up to this point. This certainly holds for
t = 0. Now, assuming that this holds for round t - 1, it is easy to see that it will
hold for round t if the following rule is applied:

- If w transmits a message $\mu$ in one of the rounds 3t, 3t + 1, 3t + 2, then append
the pair $(\mu, \phi)$ to the history list.

- If w does not transmit and exactly one message $\mu$ is received in rounds
3t, 3t + 1, 3t + 2, then append the pair $(\phi, \mu)$ to the history list.

- In any other case, append the pair $(\phi, \phi)$ to the history list.

Clearly, tt' completes broadcast in 3r rounds if tt completes broadcast in r 
rounds. 

We denote by $\Pi_1$ the class of protocols satisfying the conditions of claim 5.1.
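
The bookkeeping behind this reduction can be sketched in a few lines. The helper names `simulated_round` and `history_entry` are hypothetical, and `None` stands in for the empty symbol; the sketch only mirrors the round mapping and the three-case history rule stated in the proof, not a full protocol simulator.

```python
PHI = None  # stands for the empty symbol in a history pair


def simulated_round(t, layer):
    """Round of the new protocol in which nodes of layer `layer` (0, 1 or 2)
    perform the action they would take in round t of the original protocol."""
    return 3 * t + layer


def history_entry(sent, received):
    """History pair appended after the block of rounds 3t, 3t+1, 3t+2.

    sent: the message the node transmitted in the block, or PHI.
    received: list of messages the node received during the block.
    """
    if sent is not PHI:
        return (sent, PHI)         # the node transmitted
    if len(received) == 1:
        return (PHI, received[0])  # exactly one message was heard
    return (PHI, PHI)              # silence, or more than one message in the block
```

Since `simulated_round(t, layer) % 3 == layer`, two nodes from different layers never transmit in the same round, which is the property required of the new protocol.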



5.2 Reduction 2 

This reduction shows that instead of transmitting arbitrary messages the source 
can just send topological information. The idea is that, receiving the labels and 
connectivity of the nodes in a component, the remaining nodes can recover the 
original messages by simulation. 

Claim. Assume there exists a protocol $\pi$ from class $\Pi_1$ that completes broadcasting
in at most 3r rounds. Then, there exists a protocol $\pi'$ that completes
broadcasting in at most 3r rounds such that

- the source is provided with Advice String $\Upsilon(\pi, \mathcal{N}, 3r)$ as input;

- for each 0 < t < r, the source transmits $\Upsilon[3t - 2]$ in round 3t.

Proof. (Sketch) Clearly, it is enough to construct a protocol $\pi'$ in which the
source behaves as described in the claim, and in each round the nodes in layers
$L_1$, $L_2$ transmit exactly the same messages that they would send under protocol
$\pi$. In fact, since the nodes in $L_2$ are not connected to the source, it is sufficient
to show that the nodes in $L_1$ can behave as specified (nodes in $L_2$ just simulate
protocol $\pi$).

The argument that follows is based on a simple observation. Let S denote a
subset of the nodes in network $\mathcal{N}$. If a node w has the description of protocol $\pi$,
knows the topology of S (labels and connectivity), and knows all the messages
received by a node in S from a node in $\mathcal{N} - S$ up to round t, then w can simulate
the behavior of the nodes in S to recover all the messages that each
of these nodes would send under protocol $\pi$ up to round t + 1.

Assume first that all the messages sent by the source up to round 3(t - 1)
are $\phi$. This means that the source does not receive any message up to round
3(t - 1) - 1 under protocol $\pi$. Thus, each node w in $L_1$ can simulate the behavior
of the source under protocol $\pi$ to recover all the messages that the source would
send up to round 3(t - 1). Based on this information, node w just simulates its
own behavior under protocol $\pi$ to compute which message to send (if any) in
round 3(t - 1) + 1.

Now, assume that in round 3t, for the first time, the source transmits a message
different from $\phi$, consisting of the description of the topology of component
$C_i$. Suppose first that $C_i$ is composed of a single orphan node v. Let w be any
other node in $L_1$. Recall that w knows all the messages sent by the source up
to round 3(t - 1) under protocol $\pi$. So, when w receives the label of v in round
3t, it can simulate the behavior of node v under protocol $\pi$ to recover the message
received by the source in round 3(t - 1) + 1. With this information, w simulates
the behavior of the source to recover the message sent by it in round 3t, and
finally simulates its own behavior under protocol $\pi$ to compute the message it
should send in round 3t + 1.

Now, suppose that component $C_i$ consists of head node u and the nodes in
$Clan_{\mathcal{N}}(u)$. Then, again, when a node w in layer $L_1$ receives the description of the
topology of $C_i$, it can simulate the behavior of each node in $C_i$ to recover the
message received by the source in round 3(t - 1) + 1, then simulate the behavior
of the source up to round 3t, and finally simulate the behavior of w to compute
which message to send in round 3t + 1.

An easy inductive argument completes the proof of the claim. 

We denote by $\Pi_2$ the class of protocols satisfying the properties stated in
claim 5.2.

5.3 Reduction 3 

The following easy reduction completes the proof of Lemma 1.

Claim. Assume there exists a protocol $\pi$ from class $\Pi_2$ that completes broadcasting
in at most 3r rounds. Then, there exists a protocol $\pi'$ that completes
broadcasting in at most 3r rounds such that

- in round 0 the source transmits the broadcast message and Advice String
$\Upsilon(\pi, \mathcal{N}, 3r)$;

- the source remains silent in every round t > 0.

6 Conclusions 

We proved a lower bound of $\Omega(\sqrt{n})$ rounds for a deterministic protocol to complete
broadcasting in radius-2 networks. The outstanding open problem resulting
from this work is whether this lower bound can be further improved. Kowalski and
Pelc [4] conjecture that the lower bound proved here is optimal.

The technique of quantifying the amount of information learned by a node
in the first r rounds of an arbitrary protocol is of independent interest. To the
best of our knowledge, this idea has not been explored in the context of radio 
networks. We believe that it could be useful to prove stronger results to bridge 
the gap between the lower and upper bounds on the broadcasting problem. It 
could also be applied on more general networks and other problems in radio 
networks like gossiping. 


Abstract. In this work, we introduce and study a new model for selfish
routing over non-cooperative networks that combines features from the
two best-studied such models, namely the KP model and the Wardrop
model, in an interesting way.

We consider a set of n users, each using a mixed strategy to ship its 
unsplittable traffic over a network consisting of m parallel links. In a 
Nash equilibrium, no user can increase its Individual Cost by unilaterally 
deviating from its strategy. To evaluate the performance of such Nash 
equilibria, we introduce Quadratic Social Cost as a certain sum of In- 
dividual Costs - namely, the sum of the expectations of the squares of 
the incurred link latencies. This definition is unlike the KP model, where 
Maximum Social Cost has been defined as the maximum of Individual 
Costs. 

We analyse the impact of our modeling assumptions on the computation 
of Quadratic Social Cost, on the structure of worst-case Nash equilibria, 
and on bounds on the Quadratic Coordination Ratio. 



1 Introduction 

1.1 Motivation and Framework 

Nash Equilibria and Outline. Nash equilibrium [23,24] is arguably the most
robust equilibrium concept in (non-cooperative) Game Theory.^ At a Nash equilibrium,
no player of a strategic game can unilaterally improve its objective by
switching to a different strategy. In a pure Nash equilibrium, each player chooses
exactly one strategy (with probability one); in a mixed Nash equilibrium, the
choices of each player are modeled by a probability distribution over strategies.
Of special interest to our work is the fully mixed Nash equilibrium [22], where
each user chooses each strategy with non-zero probability. Nash equilibria have
some very nice properties; most notably, for finite games, there always exists a
mixed Nash equilibrium [24].

and by research funds at University of Cyprus.

^ See [25] for a concise introduction to contemporary Game Theory.

In this work, we embark on a systematic study, within a new model for selfish 
routing over non-cooperative networks that we introduce, of some interesting 
algorithmic and mathematical properties of Nash equilibria for some specific 
routing game formulated in this context. Our new model for selfish routing is an 
interesting hybridization of the two most famous models for selfish routing that 
were studied in the literature before; these are the so-called KP model [19] and the
Wardrop model [4,30]. 

The KP Model and the Wardrop Model. The KP and the Wardrop models 
differ with respect to the assumptions they make about: 1. the structure of
the routing network; 2. the splittability or unsplittability of the users' traffics; 3.
the definition of Individual Cost for a user they use for defining Nash equilibria; 
4. the type of Nash equilibria (pure or mixed) they consider; 5. the specific defi- 
nitions they employ for Social Cost, a performance measure for Nash equilibria, 
and for Social Optimum, an optimality measure for traffic assignments (not nec- 
essarily equilibria). The definitions for Social Cost usually relate to Individual 
Costs. In either model, these two definitions give rise to Coordination Ratio, the 
maximum value of the ratio of Social Cost over Social Optimum; a worst-case 
Nash equilibrium is one that maximizes its particular Social Cost. 

In the KP model, a collection of n users is assumed; each user employs a mixed 
strategy, which is a probability distribution over m parallel links, to control the 
shipping of its own assigned traffic. In the KP model, traffics are unsplittable. For 
each link, a capacity specifies the rate at which the link processes traffic. Allowing 
link capacities to vary arbitrarily gives rise to the standard model of related links. 
A special case of the model of related links is the model of identical links, where 
all link capacities are identical. Reciprocally, in the model of identical users, 
all user traffics are equal; they may vary arbitrarily in the model of arbitrary 
users. In a Nash equilibrium, each user selfishly routes its traffic on those links 
that minimize its Individual Cost, its expected latency cost on that link, given 
the network congestion caused by the other users. In the KP model, the Social 
Cost of a Nash Equilibrium, henceforth called Maximum Social Cost, is the 
expectation, over all random choices of the users, of the maximum, over all links, 
latency through a link; the Social Optimum, henceforth called the Maximum 
Social Optimum, is the least possible maximum, over all links, latency through a 
link that could be attained had global regulation been available; correspondingly, 
the Coordination Ratio in the KP model will henceforth be called the Maximum 
Coordination Ratio. It follows that the Maximum Social Cost in the KP model 
is the maximum of Individual Costs. 

In the Wardrop model [11,30], arbitrary networks with latency functions
for edges have been considered. Moreover, the traffics are splittable into arbitrary
pieces. Here, unregulated traffic is modeled as a network flow. Equilibrium
flows have been classified as flows with all flow paths used between a given pair 
of a source and a destination having the same latency. Equilibrium flows are 
optimal solutions to a convex program, in case the edge latency functions are 
convex. An equilibrium in this model can be interpreted as a Nash equilibrium 
in a game with infinitely many users, each carrying an infinitesimal amount of
traffic from a source to a destination. Thus, the Wardrop model restricts to pure 
Nash equilibria. The Individual Cost of a user is defined as the sum of the edge 
latencies on a path from the user’s source to its destination. The Social Cost of 
a Nash equilibrium is the sum of all Individual Costs. The Social Optimum is 
the least possible, over all network flows, sum of Individual Costs. 

The New Model for Selfish Routing. Our new model for selfish routing 
over non-cooperative networks is a hybridization of the KP model [19] and the 
Wardrop model [11,30]. More specifically, we follow the KP model to consider 
the simple parallel links network (which, however, is also a special case for the 
Wardrop model). We also follow the KP model to consider unsplittable traffics 
and mixed Nash equilibria. The Individual Cost we adopt is also identical to 
that adopted in the KP model - the expected latency cost on a link. However, 
we follow the Wardrop model to model Social Cost as a certain sum of Indi- 
vidual Costs, which we will later describe. In some sense, our new model is the 
Wardrop model restricted to the simple parallel links network but modified to 
allow for unsplittable traffics and for mixed strategies; these two features were 
borrowed from the KP model. Our work is the first step toward accommodating 
unsplittable traffics within the Wardrop model. 

For any link, consider the square of the traffic through the link divided by the 
link capacity; taking the expectation of this and adding up over all links yields 
the Social Cost for our model. Call it Quadratic Social Cost. In correspondence 
with Quadratic Social Cost, we also define and study in our new model Quadratic
Social Optimum and Quadratic Coordination Ratio. Naturally, the former is the least
possible sum of the squares of total traffic through a link divided by the link 
capacity; the latter is the maximum value of the ratio of Quadratic Social Cost 
over Quadratic Social Optimum. Since Nash equilibria are defined with respect 
to Individual Costs (but are independent of Social Cost), the Nash equilibria 
in our new model coincide with those in the KP-model since the two adopt the 
same Individual Costs. 
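
The verbal definition of Quadratic Social Cost can be turned into a small brute-force computation. The sketch below enumerates all pure profiles, so it takes exponential time and is not the dynamic programming algorithm mentioned later in the paper. The function name `quadratic_social_cost` is hypothetical, indices are 0-based, and the definition is read as "squared traffic on the link, divided by the link capacity", which is an assumption about the intended grouping.

```python
from itertools import product


def quadratic_social_cost(w, c, P):
    """Expected sum over links of (traffic on link)^2 / capacity (sketch).

    w: list of user traffics, c: list of link capacities,
    P: n x m matrix, P[i][j] = probability that user i chooses link j.
    Enumerates all m^n pure profiles, so only suitable for tiny instances.
    """
    n, m = len(w), len(c)
    total = 0.0
    for profile in product(range(m), repeat=n):
        prob = 1.0
        for i, j in enumerate(profile):
            prob *= P[i][j]
        if prob == 0.0:
            continue  # this pure profile never occurs
        load = [0.0] * m
        for i, j in enumerate(profile):
            load[j] += w[i]
        total += prob * sum(load[j] ** 2 / c[j] for j in range(m))
    return total
```

For two unit-traffic users on two unit-capacity links, the fully mixed profile with all probabilities 1/2 gives cost 3, while the pure profile that separates the users gives cost 2.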

Note that the commutativity between expectation and sum in the defini- 
tion of Quadratic Social Cost has been unavailable (between expectation and 
maximum) in the definition of Maximum Social Cost for the KP model. So, this 
commutativity allows hopes for some more tractable analysis of several problems 
regarding some interesting algorithmic, combinatorial, structural and optimality 
properties of Nash equilibria in the new model. 



1.2 Contribution and Significance 

We partition our results into three major groups. 

Combinatorial Expressions for Quadratic Social Cost. In the most gen- 
eral model of arbitrary users and arbitrary links, we obtain an elegant, recursive 
combinatorial formula for Quadratic Social Cost, implying a dynamic program- 
ming algorithm to compute Quadratic Social Cost. Furthermore, we derive sim- 
ple, combinatorial expressions for the Quadratic Social Cost of the fully mixed 
Nash equilibrium in case of arbitrary users and identical links, and identical 
users and arbitrary links, respectively. 

The Worst-case Nash Equilibrium. A natural problem that arises in the
context of Quadratic Social Cost is to identify the worst-case Nash equilibrium - 
the one that maximizes, for each specific choice of user traffics and link capacities, 
the Quadratic Social Cost. We address this problem in the particular setting of 
the model of identical users and identical links, where the fully mixed Nash 
equilibrium always exists. We prove that, in this particular setting, the worst- 
case Nash equilibrium is the fully mixed Nash equilibrium. 

Bounds on Quadratic Coordination Ratio. For the model of arbitrary users
and identical links we prove that the Quadratic Coordination Ratio for pure Nash
equilibria is precisely [...]. In case of identical users and related links, we discover
that the Quadratic Coordination Ratio for pure Nash equilibria increases slightly
to [...]. We next turn to the model of arbitrary users and identical links. Here, we
restrict ourselves to the fully mixed Nash equilibrium. For this setting, we prove
an upper bound of 2 - [...] on Quadratic Coordination Ratio. For identical users
the Quadratic Social Cost of the fully mixed Nash equilibrium slightly drops
to 1 + min{[...], [...]} times the optimal Quadratic Social Cost. Since in this
setting the fully mixed Nash equilibrium is the worst-case Nash equilibrium, this
bound holds for the Quadratic Coordination Ratio.

1.3 Related Work and Comparison 

The KP model was first introduced in the work of Koutsoupias and Papadim- 
itriou [19]; it was further studied in [9,10,12,14,16,17,18,21,22]. Fully mixed Nash 
equilibria were introduced and analyzed in [22]. Bounds on Maximum Coordi- 
nation Ratio were proved in [9,14,18,22]. The works by Fotakis et al. [16], by 
Gairing et al. [17], and by Lücking et al. [21] delved into the combinatorial
structure and the computational complexity of Nash equilibria for the KP model. 
In particular, the Fully Mixed Nash Equilibrium Conjecture was motivated by 
some results in [16], explicitly formulated in [17] and further studied in [21]. The 
Wardrop model was defined in [30] and further studied in [3,4,11]. Recent studies 
of selfish routing within the Wardrop model include [27,28,29]. 

Fotakis et al. [16, Theorem 8] proved that computing the Maximum Social
Cost of an arbitrary Nash equilibrium is a #P-complete problem. This hardness
result stands in very sharp contrast to our general, pseudopolynomial algorithm 
to compute Quadratic Social Cost (Theorem 1). 

For the KP model, there are known bounds on Maximum Coordination Ratio
of $\Theta\left(\frac{\lg m}{\lg \lg m}\right)$ for the model of arbitrary users and identical links [9,18,19,22],
of $\Theta\left(\frac{\lg m}{\lg \lg \lg m}\right)$ for the model of arbitrary users and related links [9], and of
$O(\sqrt{m})$ for the model of arbitrary users and related links and for pure Nash
equilibria [14], which improves the previous bound for small values of m. Some
of these super-constant bounds stand in very sharp contrast to some of the 
constant bounds (independent of m and n) on Quadratic Coordination Ratio 
we prove in this work. However, for the Wardrop model, there have been shown 
constant bounds on Coordination Ratio [27,28,29]. 

Other works that have studied Coordination Ratio include [13] for a network 
creation game and [2] for a network design game. For a survey of recent work
on selfish routing in non-cooperative networks, see [15]. Work in the scheduling
literature that has considered quadratic cost functions for makespan includes [1, 
6,8,20]; work in the networking literature that has considered quadratic cost 
functions for network delay includes [7]. 

1.4 Road Map 

The rest of this paper is organized as follows. Section 2 presents our definitions 
and some preliminaries. The Quadratic Social Cost of Nash equilibria is studied 
in Section 3. Section 4 proves that the fully mixed Nash equilibrium maximizes 
Quadratic Social Cost in the model of identical users and identical links. Our 
bounds on Quadratic Coordination Ratio are presented in Section 5. Due to lack 
of space the proofs are omitted. They can be found in the full version. 

2 Framework 

2.1 Mathematical Preliminaries and Notation 

Throughout, denote for any integer $m \ge 2$, $[m] = \{1, \ldots, m\}$. For a random
variable X, denote by $\mathcal{E}(X)$ the expectation of X. We continue by proving a simple
combinatorial inequality.

Lemma 1. For any $k, a, b \in \mathbb{N}$ with $0 < k \le a \le b$, $\frac{1}{a^k}\binom{a}{k} \le \frac{1}{b^k}\binom{b}{k}$.

Finally, we prove a combinatorial lemma that will be useful in a later proof. 

Lemma 2. Fix any real number a, where 0 < a < 1, and positive integer r,and 
setA=l. Then, Er<,<. {l)k^ (i)'= (l - ^ ^ 



2.2 General 

We consider a network consisting of a set of m parallel links 1,2,... , m from a 
source node to a destination node. Each of n network users 1,2,... , n, or users 
for short, wishes to route a particular amount of traffic along a (non-fixed) link 
from source to destination. (Throughout, we will be using subscripts for users 
and superscripts for links.) 

Denote by $w_i$ the traffic of user $i \in [n]$. Define the $n \times 1$ traffic vector $\mathbf{w}$ in the 
natural way. Assume throughout that $m > 1$ and $n > 1$. Assume also, without 
loss of generality, that $w_1 \ge w_2 \ge \dots \ge w_n$. Denote $W = \sum_{i \in [n]} w_i$. 

A pure strategy for user $i \in [n]$ is some specific link. A mixed strategy for user 
$i \in [n]$ is a probability distribution over pure strategies; thus, a mixed strategy 
is a probability distribution over the set of links. The support of the mixed 
strategy for user $i \in [n]$, denoted sup(i), is the set of those pure strategies (links) 
to which $i$ assigns positive probability. A pure strategy profile is represented by 
an n-tuple $(\ell_1, \ell_2, \dots, \ell_n) \in [m]^n$; a mixed strategy profile is represented by an 
$n \times m$ probability matrix $P$ of $nm$ probabilities $p_i^j$, $i \in [n]$ and $j \in [m]$, where $p_i^j$ 
is the probability that user $i$ chooses link $j$. 

For a probability matrix $P$, define indicator variables $I_i^j \in \{0, 1\}$, where 
$i \in [n]$ and $j \in [m]$, such that $I_i^j = 1$ if and only if $p_i^j > 0$. Thus, the support of 
the mixed strategy for user $i \in [n]$ is the set $\{j \in [m] \mid I_i^j = 1\}$. For each link 
$j \in [m]$, define the view of link $j$, denoted view(j), as the set of users $i \in [n]$ 
that potentially assign their traffics to link $j$; so, view(j) $= \{i \in [n] \mid I_i^j = 1\}$. 
For each link $j \in [m]$, denote $r^j = |view(j)|$. 

A mixed strategy profile $P$ is fully mixed [22, Section 2.2] if for all users $i \in [n]$ 
and links $j \in [m]$, $I_i^j = 1$. Throughout, we will cast a pure strategy profile as a 
special case of a mixed strategy profile in which all (mixed) strategies are pure. 



2.3 System, Models, and Cost Measures 

Denote by $c^\ell > 0$ the capacity of link $\ell \in [m]$, representing the rate at which the link 
processes traffic. So, the latency for traffic $w$ through link $\ell$ equals $w / c^\ell$. In the 
model of uniform capacities, all link capacities are equal to $c$, for some constant 
$c > 0$; link capacities may vary arbitrarily in the model of arbitrary capacities. 
Assume throughout, without loss of generality, that $c^1 \ge c^2 \ge \dots \ge c^m$. Denote 
$C = \sum_{\ell \in [m]} c^\ell$. In the model of identical traffics, all user traffics are equal to 1; 
user traffics may vary arbitrarily in the model of arbitrary traffics. 

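For a pure strategy profile $(\ell_1, \ell_2, \dots, \ell_n)$, the latency cost for user $i \in [n]$, 
denoted $\lambda_i$, is $\left(\sum_{k : \ell_k = \ell_i} w_k\right) / c^{\ell_i}$; that is, the latency cost for user $i$ is the latency 
of the link it chooses. For a mixed strategy profile $P$, denote by $\delta^\ell$ the actual traffic 
on link $\ell \in [m]$; so, $\delta^\ell$ is a random variable. For each link $\ell \in [m]$, denote by $\theta^\ell$ the 
expected traffic on link $\ell$; thus, $\theta^\ell = \mathcal{E}(\delta^\ell) = \sum_{i=1}^{n} p_i^\ell w_i$. Given $P$, define 
the $m \times 1$ expected traffic vector $\Theta$ induced by $P$ in the natural way. Given $P$, 
denote by $\Lambda^\ell$ the expected latency on link $\ell \in [m]$; clearly, $\Lambda^\ell = \theta^\ell / c^\ell$. Define the 
$m \times 1$ expected latency vector $\Lambda$ in the natural way. For a mixed strategy profile 
$P$, the expected latency cost for user $i \in [n]$ on link $\ell \in [m]$, denoted $\lambda_i^\ell$, is the 
expectation, over all random choices of the remaining users, of the latency cost 
for user $i$ had its traffic been assigned to link $\ell$; thus, 

$$\lambda_i^\ell = \frac{w_i + \sum_{k=1, k \ne i}^{n} p_k^\ell w_k}{c^\ell} = \frac{(1 - p_i^\ell)\, w_i + \theta^\ell}{c^\ell}.$$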



For each user $i \in [n]$, the minimum expected latency cost, denoted $\lambda_i$, is the 
minimum, over all links $\ell \in [m]$, of the expected latency cost for user $i$ on link 
$\ell$; thus, $\lambda_i = \min_{\ell \in [m]} \lambda_i^\ell$. For a probability matrix $P$, define the $n \times 1$ minimum 
expected latency cost vector $\lambda$ induced by $P$ in the natural way. 

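Associated with a traffic vector w, a capacity vector c and a mixed strategy 
profile P is the Quadratic Social Cost, denoted QSC(w, c, P), which is the 
expectation of the sum of squares of the incurred link latencies; thus, 

$$\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P) = \mathcal{E}\left(\sum_{\ell \in [m]} \frac{(\delta^\ell)^2}{c^\ell}\right) = \sum_{(\ell_1, \ell_2, \dots, \ell_n) \in [m]^n} \left(\prod_{k=1}^{n} p_k^{\ell_k} \cdot \sum_{\ell \in [m]} \frac{\left(\sum_{k : \ell_k = \ell} w_k\right)^2}{c^\ell}\right).$$

Since the expectation of a sum is equal to the sum of expectations, we can write 

$$\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P) = \sum_{\ell \in [m]} \sum_{A \subseteq [n]} \prod_{i \in A} p_i^\ell \prod_{i \notin A} (1 - p_i^\ell) \cdot \frac{\left(\sum_{i \in A} w_i\right)^2}{c^\ell}. \qquad (1)$$
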
The Quadratic Optimum associated with a traffic vector w and a capacity vector 
c, denoted QOPT(w, c), is the least possible sum of squares of the incurred link 
latencies. Note that while QSC(w, c, P) is defined in relation to a mixed strategy 
profile P, QOPT(w,c) refers to the optimum pure strategy profile. 

The Maximum Social Cost, denoted MSC(w, c, P), which is used in the orig- 
inal KP model, is defined as the expectation of the maximum of the incurred 
link latencies. Correspondingly, the Maximum Optimum, denoted MOPT(w, c), 
is the minimum, over all assignments, maximum incurred link latency. 



2.4 Nash Equilibria 

We are interested in a special class of mixed strategies called Nash equilibria [24] 
that we describe below. Formally, the probability matrix $P$ is a Nash 
equilibrium [19, Section 2] if for all users $i \in [n]$ and links $\ell \in [m]$, $\lambda_i^\ell = \lambda_i$ if 
$I_i^\ell = 1$, and $\lambda_i^\ell > \lambda_i$ if $I_i^\ell = 0$. Thus, each user assigns its traffic with positive 
probability only on links (possibly more than one of them) for which its expected 
latency cost is minimized; this implies that there is no incentive for a user to 
unilaterally deviate from its mixed strategy in order to avoid links on which its 
expected latency cost is higher than necessary. 

Mavronicolas and Spirakis [22, Lemma 15] show that in the model of arbi- 
trary users and identical links, all links are equiprobable in a fully mixed Nash 
equilibrium. 

Lemma 3 (Mavronicolas and Spirakis [22]). Consider the model of arbitrary 
users and identical links. Then, there exists a unique fully mixed Nash 
equilibrium with associated Nash probabilities $p_i^\ell = 1/m$, for any user $i \in [n]$ and 
link $\ell \in [m]$. 
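
The equilibrium condition of Section 2.4 can be checked mechanically. The following Python sketch is ours, not part of the original development: the function names are hypothetical, and it tests only that every link in a user's support attains that user's minimum expected latency cost (links outside the support are then at least as costly, which is the weak form of the inequality). Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def expected_latency_costs(w, c, P):
    """Return lam[i][l] = ((1 - P[i][l]) * w[i] + theta[l]) / c[l]."""
    n, m = len(w), len(c)
    # theta[l]: expected traffic on link l under the probability matrix P
    theta = [sum(P[i][l] * w[i] for i in range(n)) for l in range(m)]
    return [
        [Fraction((1 - P[i][l]) * w[i] + theta[l]) / c[l] for l in range(m)]
        for i in range(n)
    ]


def is_nash(w, c, P):
    """True if each user puts positive probability only on cost-minimizing links."""
    lam = expected_latency_costs(w, c, P)
    for i, row in enumerate(lam):
        best = min(row)
        for l, cost in enumerate(row):
            if P[i][l] > 0 and cost != best:
                return False
    return True
```

For instance, on identical links the matrix with all entries $1/m$ passes the check, in line with Lemma 3, whereas two unit users both routed on the first of two identical links do not.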



2.5 Coordination Ratio and Quadratic Coordination Ratio 



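The Quadratic Coordination Ratio is the maximum value, over all traffic vectors 
w, capacity vectors c, and Nash equilibria P, of the ratio $\frac{\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P)}{\mathrm{QOPT}(\mathbf{w}, \mathbf{c})}$. In a 
corresponding way, the Maximum Coordination Ratio is defined in [19] as the 
maximum value, over all traffic vectors w, capacity vectors c and Nash equilibria 
P, of the ratio $\frac{\mathrm{MSC}(\mathbf{w}, \mathbf{c}, P)}{\mathrm{MOPT}(\mathbf{w}, \mathbf{c})}$. 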



3 The Quadratic Social Cost of Nash Equilibria 

In this section, we study the Quadratic Social Cost of arbitrary (mixed) Nash 
equilibria. We start by proving: 

Theorem 1 (Quadratic Social Cost of Arbitrary Nash Equilibrium). 
Fix any traffic vector w, any capacity vector c, and any arbitrary Nash 
equilibrium P. Then, QSC(w, c, P) can be computed in time $O(nmW)$. 
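
A running time of this order can be reached by a simple dynamic program. The sketch below is only our illustration and need not coincide with the omitted proof: it assumes integer traffics, takes the per-link cost to be the squared load divided by the capacity, and uses hypothetical function names. For each link it builds the distribution of the load user by user (a table of size $W + 1$, updated $n$ times), which gives $O(nW)$ table updates per link and $O(nmW)$ in total.

```python
from fractions import Fraction


def load_distribution(w, probs):
    """dist[t] = probability that the users choosing this link carry total traffic t.

    w: positive integer traffics; probs[i]: probability that user i picks the link.
    """
    total = sum(w)
    dist = [Fraction(0)] * (total + 1)
    dist[0] = Fraction(1)
    for wi, pi in zip(w, probs):
        new = [Fraction(0)] * (total + 1)
        for t, q in enumerate(dist):
            if q:
                new[t] += q * (1 - pi)   # user i stays off the link
                new[t + wi] += q * pi    # user i joins the link
        dist = new
    return dist


def quadratic_social_cost(w, c, P):
    """Expected sum over links of (load squared) / capacity, by linearity of expectation."""
    cost = Fraction(0)
    for l, cap in enumerate(c):
        dist = load_distribution(w, [row[l] for row in P])
        cost += sum(q * t * t for t, q in enumerate(dist)) / cap
    return cost
```

Only the marginal load distribution of each link is needed, because the cost is a sum over links and expectation is linear.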

We next establish that the Quadratic Social Cost takes a particularly nice 
form for the case of the fully mixed Nash equilibrium. We prove: 

Theorem 2 (Quadratic Social Cost of Fully Mixed Nash Equilibrium). 
Consider the model of arbitrary users and identical links. Then, for any traffic 
vector w, 

$$\mathrm{QSC}(\mathbf{w}, F) = W_1 + \frac{1}{m} W_2,$$

where $W_1 = \sum_{i \in [n]} w_i^2$ and $W_2 = \sum_{i, k \in [n],\, i \ne k} w_i w_k$. 
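
As a sanity check of this closed form, one can enumerate all $m^n$ pure profiles of a small instance, each of which has probability $m^{-n}$ under the fully mixed equilibrium of Lemma 3. The Python sketch below is ours; it assumes unit link capacities and reads $W_2$ as a sum over ordered pairs $(i, k)$ with $i \ne k$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def qsc_fully_mixed_bruteforce(w, m):
    """Average of the sum of squared link loads over all m**n equally likely profiles."""
    n = len(w)
    total = Fraction(0)
    for profile in product(range(m), repeat=n):
        loads = [0] * m
        for user, link in enumerate(profile):
            loads[link] += w[user]
        total += sum(Fraction(x) ** 2 for x in loads)
    return total / Fraction(m) ** n


def qsc_fully_mixed_closed_form(w, m):
    """W_1 + W_2 / m with W_1 = sum of w_i^2 and W_2 = sum over ordered pairs i != k."""
    n = len(w)
    w1 = sum(Fraction(x) ** 2 for x in w)
    w2 = sum(Fraction(w[i]) * w[k] for i in range(n) for k in range(n) if i != k)
    return w1 + w2 / m
```

For identical users ($w_i = 1$) the closed form reduces to $n + n(n-1)/m = n(n+m-1)/m$.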

The next lemma is used in the proof of Proposition 1. 

Lemma 4. Let $a, n \in \mathbb{N}$, $a$ even, and let $p_i \in [0, 1]$ for all $1 \le i \le n$. Denote 
$P = (p_1, \dots, p_n)$ and $p = \sum_{1 \le i \le n} p_i$. Set 

H{p)= Y, lArjnp^l 

.4c[n] lieA J J 

Define P by Pi = ^ ■ p for all 1 < i < n. Then H (P) < H (P) . 



Proposition 1. Consider the model of identical users and identical links. Then, 
for any arbitrary Nash equilibrium P, 



QSC(P)< ^ 

je[m] 





where $\theta^j = \sum_{i \in [n]} p_i^j$ and $r^j = |view(j)|$. 

We finally prove: 

Theorem 3. Consider the model of identical users and related links. Then, 

$$\mathrm{QSC}(\mathbf{c}, F) = \frac{n(n + m - 1)}{C}.$$

Corollary 1 (Quadratic Social Cost of Fully Mixed Nash Equilibrium). 
Consider the model of identical users and identical links. Then, 

$$\mathrm{QSC}(\mathbf{w}, \mathbf{c}, F) = \frac{n(n + m - 1)}{m}.$$

4 The Worst-Case Nash Equilibrium 

We now establish that, for the model of identical users and identical links, the 
worst-case Nash equilibrium is the fully mixed Nash equilibrium. We start our 
proof with a technical lemma which holds in the more general model of arbitrary 
users, and then return to the model of identical users and identical links. 

Lemma 5. Consider n arbitrary users on m identical links, and let $j, k \in [m]$. 

1. If $view(j) = view(k) \ne \emptyset$, then $|view(j)| = 1$, or $\theta^j = \theta^k$ and $p_i^j = p_i^k$ for all 
$i \in [n]$. 

2. If $view(j) \subset view(k)$, then $\theta^j > \theta^k$. 



Theorem 4. Consider the model of identical users and identical links. Then, 
for any arbitrary Nash equilibrium P, $\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P) \le \mathrm{QSC}(\mathbf{w}, \mathbf{c}, F)$. 

5 Bounds on Quadratic Coordination Ratio 

In this section, we present our bounds on Quadratic Coordination Ratio. We 
start by proving: 

Theorem 5 (Quadratic Coordination Ratio for Pure Nash Equilibria). 
Consider the model of arbitrary users and identical links, restricted to pure Nash 
equilibria. Then, 

$$\max_{\mathbf{w}, P} \frac{\mathrm{QSC}(\mathbf{w}, P)}{\mathrm{QOPT}(\mathbf{w})} = \frac{9}{8}.$$


We give here only a sketch of the proof. Let there be n users and m links. If 
n < m, then every pure Nash equilibrium has optimal social cost. Now assume 
n > m. Let P be any pure Nash equilibrium. Let us first assume that Wi < ^ 
holds for all users $i \in [n]$. Let $B = \min_{j \in [m]} \delta^j$ be the minimum traffic on any 
of the links. Then $B > 0$, and it has been shown in [17] that on every link the 
load is bounded by $2B$. 

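We use some iterative procedure to compute an upper bound for QSC(w, c, P). 
When the algorithm terminates, we know that there exists some $k \in [m]$ 
such that 

— $\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P) \le \sum_{j \in [m]} x_j^2$, 

— $x_j = 2B$ for $k$ links, 

— $x_j = B + x$, $0 \le x < B$, for one link, and 

— $x_j = B$ for $m - k - 1$ links. 

Note that $\mathrm{QOPT}(\mathbf{w}, \mathbf{c}) \ge \frac{(mB + kB + x)^2}{m}$ and therefore 

$$\frac{\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P)}{\mathrm{QOPT}(\mathbf{w}, \mathbf{c})} \le \frac{\left((3k + m - 1)B^2 + (B + x)^2\right) m}{(mB + kB + x)^2} =: f(k).$$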

Maximizing the function f{k) shows the upper bound for the case that Wi < ^ 
for all z G [n]. 
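
The maximization can be illustrated numerically. The sketch below is ours: it evaluates, with exact rationals, the ratio of $(3k + m - 1)B^2 + (B + x)^2$ (the sum of squares of $k$ loads $2B$, one load $B + x$ and $m - k - 1$ loads $B$) to $(mB + kB + x)^2 / m$, and searches over $k$. The integer parameters `B` and `x` and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def f(k, m, B=1, x=0):
    """((3k + m - 1) B^2 + (B + x)^2) * m / (m B + k B + x)^2 as an exact fraction."""
    numerator = ((3 * k + m - 1) * B ** 2 + (B + x) ** 2) * m
    denominator = (m * B + k * B + x) ** 2
    return Fraction(numerator, denominator)


def worst_k(m, B=1, x=0):
    """Number k of doubly loaded links (0 <= k <= m - 1) that maximizes f."""
    return max(range(m), key=lambda k: f(k, m, B, x))
```

With $x = 0$ the function becomes $(3k + m)m / (m + k)^2$, which peaks at $k = m/3$ with value $9/8$; for example `f(2, 6)` evaluates to `Fraction(9, 8)`.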

In case that Wi > — holds for some user z G \n\, such a user is alone on 
its link in every Nash equilibrium P, and both the user and the link can be 
omitted, increasing the coordination ratio. To show tightness, we give an instance. 

We continue with a similar result for the reciprocal case of identical users and 
related links: 

Theorem 6 (Quadratic Coordination Ratio for Pure Nash Equilibria). 
Consider the model of identical users and related links, restricted to pure Nash 
equilibria. Then, 

$$\max_{P} \frac{\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P)}{\mathrm{QOPT}(\mathbf{w}, \mathbf{c})} = \frac{4}{3}.$$

We give here only a sketch of the full proof. First, we show that no instance 
with traffic vector $\mathbf{w} = \{1\}^n$, capacity vector c and pure Nash equilibrium P 
exists with Quadratic Coordination Ratio greater than $4/3$. To this end, we assume, 
by way of contradiction, that such an instance exists, and fix the minimal (in the 
number of links) such counterexample, its worst-case Nash equilibrium P and 
an optimal assignment Q. We denote the traffic of each link $j$ by $\delta^j(P)$ when 
referring to P, and by $\delta^j(Q)$ when referring to the optimum assignment. Lemma 6 
shows that $\delta^j(P)$ is smaller than $\delta^j(Q)$ by at most one for any link $j$. Lemma 7 
shows that, for the instance under consideration, $\delta^k(P)$ is greater than $\delta^k(Q)$ 
for exactly one link $k$, and that no link has the same traffic according to P 
and Q. This implies $\delta^k(P) = \delta^k(Q) + m - 1$, because all remaining links must have 
$\delta^j(P) = \delta^j(Q) - 1$. Lemma 8 shows that, if not all links except for $k$ have the 
same capacity $c$ and the same traffic, then we can create a new instance with 
$m - 1$ identical links, having the same traffic, and one additional link, which 
has at least the same Quadratic Coordination Ratio. Hence, we can consider 
this new instance in order to bound the Quadratic Coordination Ratio of the 
original instance from above. To do so, we write down an optimization problem 
which overestimates the Quadratic Coordination Ratio of the new instance and 
includes, as constraints, the Nash equilibrium property and the optimality criterion 
from Lemma 9. The optimization problem evaluates to $4/3$, which contradicts the 
initial assumption. 

To prove that the bound is tight, we construct an instance with Quadratic 
Coordination Ratio $4/3$ for any number of links $m$. 

Lemma 6. Let (w, c), Q, P, $\delta^j(Q)$ and $\delta^j(P)$ be as in the proof of Theorem 6. 
Then, $\delta^j(Q) - \delta^j(P) \le 1$ for all $j \in [m]$. 

Lemma 7. Let (w, c), Q, P, $\delta^j(Q)$ and $\delta^j(P)$ be as in the proof of Theorem 6. 
Then, $\delta^k(P) = \delta^k(Q) + m - 1$ for some $k \in [m]$, and $\delta^j(P) = \delta^j(Q) - 1$ for all 
$j \in [m] \setminus \{k\}$. 
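
Lemma 8. Let (w, c), Q, P, $\delta^j(Q)$ and $\delta^j(P)$ be as in the proof of Theorem 
6. Then, there exists an instance $(\mathbf{w}, \tilde{\mathbf{c}})$ that has the same number $m$ of links 
as (w, c), with optimal assignment $\tilde{Q}$ and Nash equilibrium assignment $\tilde{P}$, such that 
$\delta^k(\tilde{P}) = \delta^k(\tilde{Q}) + m - 1$ for some link $k \in [m]$, $\delta^i(\tilde{Q}) = \delta^j(\tilde{Q})$ and $\tilde{c}_i = \tilde{c}_j$ for 
all $i, j \in [m] \setminus \{k\}$, and 

$$\frac{\mathrm{QSC}(\mathbf{w}, \tilde{\mathbf{c}}, \tilde{P})}{\mathrm{QOPT}(\mathbf{w}, \tilde{\mathbf{c}})} \ge \frac{\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P)}{\mathrm{QOPT}(\mathbf{w}, \mathbf{c})}.$$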



Lemma 9. Let Q be any pure assignment for an instance (w, c) of the model 
of identical traffics and related links, and let $\mathbf{w} = (w, \dots, w)$. Then, Q is optimal, 
i.e., QSC(w, c, Q) = QOPT(w, c), if and only if for every pair of links $i, j \in [m]$ 

$$\frac{(\delta^i(Q) + w)^2}{c_i} + \frac{(\delta^j(Q) - w)^2}{c_j} \ge \frac{(\delta^i(Q))^2}{c_i} + \frac{(\delta^j(Q))^2}{c_j}.$$
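
The criterion of Lemma 9 says that moving one user from link $j$ to link $i$ never lowers the cost. For small instances it can be compared against exhaustive search. The sketch below is ours: it takes unit traffics ($w = 1$) and integer capacities, so an assignment is described by its vector of link loads, and it uses the cost read off from the inequality of the lemma, that is, the sum over links of the squared load divided by the capacity. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def cost(loads, c):
    """Sum over links of load^2 / capacity."""
    return sum(Fraction(x * x, ci) for x, ci in zip(loads, c))


def satisfies_exchange_condition(loads, c, w=1):
    """Pairwise condition: shifting traffic w from link j to link i never helps."""
    m = len(c)
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            moved = Fraction((loads[i] + w) ** 2, c[i]) + Fraction((loads[j] - w) ** 2, c[j])
            stay = Fraction(loads[i] ** 2, c[i]) + Fraction(loads[j] ** 2, c[j])
            if moved < stay:
                return False
    return True


def all_load_vectors(n, m):
    """All ways to split n unit users over m links."""
    return [loads for loads in product(range(n + 1), repeat=m) if sum(loads) == n]
```

On the instances we tried, the load vectors satisfying the pairwise condition are exactly the minimizers of the cost, as the lemma states.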

We next prove: 

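Theorem 7. Consider the model of arbitrary users and identical links. Then, 

$$\max_{\mathbf{w}, \mathbf{c}} \frac{\mathrm{QSC}(\mathbf{w}, \mathbf{c}, F)}{\mathrm{QOPT}(\mathbf{w}, \mathbf{c})} \le 2 - \frac{1}{m}.$$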



We next prove: 

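Theorem 8. Consider the model of identical users and identical links. Then, 
for any traffic vector w, capacity vector c and mixed Nash equilibrium P, 

$$\frac{\mathrm{QSC}(\mathbf{w}, \mathbf{c}, P)}{\mathrm{QOPT}(\mathbf{w}, \mathbf{c})} \le 1 + \min\left\{\frac{m - 1}{n}, \frac{n - 1}{m}\right\} \le 2 - \frac{1}{m}.$$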
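
The bound can be checked numerically against the worst case. By Theorem 4 the fully mixed equilibrium maximizes Quadratic Social Cost in this model, and Corollary 1 gives its cost as $n(n+m-1)/m$. The sketch below is ours and additionally assumes unit capacities and that the optimum spreads the $n$ unit users as evenly as possible over the $m$ links; the function names are hypothetical.

```python
from fractions import Fraction


def qsc_fully_mixed(n, m):
    """Corollary 1: cost of the fully mixed equilibrium, n (n + m - 1) / m."""
    return Fraction(n * (n + m - 1), m)


def qopt_identical(n, m):
    """Sum of squared loads when n unit users are spread as evenly as possible."""
    q, r = divmod(n, m)
    return r * (q + 1) ** 2 + (m - r) * q ** 2


def ratio_bound(n, m):
    """Right-hand side 1 + min{(m - 1)/n, (n - 1)/m}."""
    return 1 + min(Fraction(m - 1, n), Fraction(n - 1, m))
```

For $n = m$ the optimum is $m$ and the ratio equals $2 - 1/m$, so the outer bound is attained.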



Acknowledgments. We thank Rainer Feldmann and Martin Gairing for several 
helpful discussions. 



References 

1. N. Alon, Y. Azar, G. J. Woeginger and T. Yadid, Approximation Schemes for 
Scheduling, Proc. of SODA 1997, pp. 493-500. 

2. E. Anshelevich, A. Dasgupta, E. Tardos, and T. Wexler, Near-Optimal Network 
Design with Selfish Agents, Proc. of STOC 2003, pp. 511-520. 

3. M. J. Beckmann, On the Theory of Traffic Flow in Networks, Traffic Quart, Vol. 21, 
pp. 109-116, 1967. 

4. M. Beckmann, C. B. McGuire and C. B. Winsten, Studies in the Economics of 
Transportation, Yale University Press, 1956. 

5. D. Braess, Über ein Paradoxon aus der Verkehrsplanung, Unternehmensforschung, 
Vol. 12, pp. 258-268, 1968. 

6. A. K. Chandra and C. K. Wong, Worst-case Analysis of a Placement Algorithm 
Related to Storage Allocation, SICOMP 1975, Vol. 4, pp. 249-263. 

7. E. Altman, T. Basar, T. Jimenez and N. Shimkin, Competitive Routing in Net- 
works with Polynomial Costs, IEEE Transactions on Automatic Control, Vol. 47, 
pp. 92-96, 2002. 







8. R. A. Cody and E. G. Coffman, Jr., Record Allocation for Minimizing Expected 
Retrieval Costs on Drum-Like Storage Devices, JACM, Vol. 23, pp. 103-115, 1976. 

9. A. Czumaj and B. Vocking, Tight Bounds for Worst-Case Equilibria, Proc. of 
SODA 2002, pp. 413-420. 

10. A. Czumaj, P. Krysta and B. Vocking, Selfish Traffic Allocation for Server Farms, 
Proc. of STOC 2002, pp. 287-296. 

11. S. C. Dafermos and F. T. Sparrow, The Traffic Assignment Problem for a Gen- 
eral Network. Journal of Research of the National Bureau of Standards, Series B, 
Vol. 73B, pp. 91-118, 1969. 

12. E. Even-Dar, A. Kesselman and Y. Mansour, “Convergence Time to Nash Equi- 
libria,” Proc. of ICALP 2003, J. C. M. Baeten, J. K. Lenstra, J. Parrow and 
G. J. Woeginger eds., Vol. 2719, pp. 502-513. 

13. A. Fabrikant, A. Luthra, E. Maneva, C. H. Papadimitriou, and S. Shenker, On a 
Network Creation Game, Proc. of PODC 2003, pp. 347-351. 

14. R. Feldmann, M. Gairing, T. Lücking, B. Monien and M. Rode, Nashification 
and the Coordination Ratio for a Selfish Routing Game, Proc. of ICALP 2003, 
Vol. 2719, pp. 514-526. 

15. R. Feldmann, M. Gairing, T. Lucking, B. Monien and M. Rode, Selfish Routing in 
Non-Cooperative Networks: A Survey, Proc. of MFCS 2003, Vol. 2747, pp. 21-45. 

16. D. Fotakis, S. Kontogiannis, E. Koutsoupias, M. Mavronicolas and P. Spirakis, The 
Structure and Complexity of Nash Equilibria for a Selfish Routing Game, Proc. of 
ICALP 2002, Vol. 2380, pp. 123-134. 

17. M. Gairing, T. Lucking, M. Mavronicolas, B. Monien and P. Spirakis, The Struc- 
ture and Complexity of Extreme Nash Equilibria, submitted for publication, 2003. 

18. E. Koutsoupias, M. Mavronicolas and P. Spirakis, Approximate Equilibria and Ball 
Fusion, Proc. of SIROCCO 2002, pp. 223-235. 

19. E. Koutsoupias and C. H. Papadimitriou, Worst-case Equilibria, Proc. of STACS 
1999, G. Meinel and S. Tison eds., Vol. 1563, pp. 404-413. 

20. J. Y. T. Leung and W. D. Wei, Tighter Bounds on a Heuristic for a Partition 
Problem, Information Processing Letters, Vol. 56, pp. 51-57, 1995. 

21. T. Lucking, M. Mavronicolas, B. Monien, M. Rode, P. Spirakis and I. Vrto, Which 
is the Worst-case Nash equilibrium?, Proc. of MFCS 2003, B. Rovan and P. Vojtas 
eds., Vol. 2747, pp. 551-561. 

22. M. Mavronicolas and P. Spirakis, The Price of Selfish Routing, Proc. of STOC 
2001, pp. 510-519. 

23. J. F. Nash, Equilibrium Points in N-Person Games, Proceedings of the National 
Academy of Sciences, Vol. 36, pp. 48-49, 1950. 

24. J. F. Nash, Non-cooperative Games, Annals of Mathematics, Vol. 54, No. 2, pp. 
286-295, 1951. 

25. M. J. Osborne and A. Rubinstein, A Course in Game Theory, The MIT Press, 
1994. 

26. C. H. Papadimitriou, Algorithms, Games and the Internet, Proc. of STOC 2001, 
pp. 749-753. 

27. T. Roughgarden, The Price of Anarchy is Independent of the Network Topology, 
Proc. of STOC 2002, pp. 428-437. 

28. T. Roughgarden, Selfish Routing, Ph. D. Thesis, Department of Computer Science, 
Cornell University, May 2002. 

29. T. Roughgarden and E. Tardos, How Bad is Selfish Routing? JACM, Vol. 49, pp. 
236-259, 2002. 

30. J. G. Wardrop, Some Theoretical Aspects of Road Traffic Research, Proceedings of 
the Institute of Civil Engineers, Pt. II, Vol. 1, pp. 325-378, 1952. 




Broadcast in the Rendezvous Model 



Philippe Duchon, Nicolas Hanusse, Nasser Saheb, and Akka Zemmari 

LaBRI - CNRS - Universite Bordeaux I, 351 Cours de la Liberation, 33405 Talence, 
France, {duchon,hanusse,saheb,zemmari}@labri.fr 



Abstract. In many large, distributed or mobile networks, broadcast 
algorithms are used to update information stored at the nodes. In this 
paper, we propose a new model of communication based on rendezvous 
and analyze a multi-hop distributed algorithm to broadcast a message in 
a synchronous setting. In the rendezvous model, two neighbors u and v 
can communicate if and only if u calls v and v calls u simultaneously. 
Thus nodes u and v obtain a rendezvous at a meeting point. 

If m is the number of meeting points, the network can be modeled by a 
graph of n vertices and m edges. At each round, every vertex chooses a 
random neighbor and there is a rendezvous if an edge has been chosen 
by its two extremities. Rendezvous enable an exchange of information 
between the two entities. We get sharp lower and upper bounds on the 
time complexity in terms of number of rounds to broadcast: we show 
that, for any graph, the expected number of rounds is between $\log n$ and 
$O(n^2)$. For these two bounds, we prove that there exist some graphs for 
which the expected number of rounds is either $O(\log n)$ or $\Omega(n^2)$. For 
specific topologies, additional bounds are given. 

Keywords: Algorithms and data structures, distributed algorithms, 
graph, broadcast, rendezvous model 



1 Introduction 

Among the numerous algorithms to broadcast in a synchronized setting, we 
are witnessing a new tendency of distributed and randomized algorithms, also 
called gossip-based algorithms: at each instant, any number of broadcasts can 
take place simultaneously and we do not give any priority to any particular one. 
In each round, a node chooses a random neighbor and tries to exchange some 
information. Due to the simplicity of gossip-based algorithms, such an approach 
provides reliability and scalability. Contrary to deterministic schemes for which 
messages tend to route in a particular subgraph (for instance a tree), a gossip- 
based algorithm can be fault-tolerant (or efficient for a dynamic network) since 
in a strongly connected network, many paths can be used to transmit a message 
to almost every node. 

The majority of results deal with the uniform random phone call for which 
a node chooses a neighbor uniformly at random. However, such a model does 
not take into account that a given node could be “called” by many nodes si- 
multaneously implying a potential congestion. A more embarrassing situation is 
the one of radio networks, in which a node must be called by a unique 
neighbor at a time; otherwise the received messages collide. In the 
rendezvous model, every node chooses a neighbor and if two neighbors choose 
each other mutually, they can exchange some information. The rendezvous model 
is useful if a physical meeting is needed to communicate, as in the case of a 
network of robots. 

Although the rendezvous model can be used in different settings, we describe 
the problem of broadcasting a message in a network of robots. A robot is an 
autonomous entity with a bounded amount of memory having the capacity to 
perform some tasks and to communicate with other entities by radio when they 
are geographically close. Examples of use of such robots are numerous: explo- 
ration [1,7], navigation (see Survey of [16]), capture of an intruder [3], search for 
information, help to handicapped people or rescue, cleaning of buildings, ... The 
literature contains many efficient algorithms for one robot and multiple robots 
are seen as a way to speed up the algorithms. However, in a network of robots 
[4], the coordination of multiple robots implies complex algorithms. Rendezvous 
between robots can be used in the following setting: consider a set of robots 
distributed on a geometric environment. Even if two robots sharing a region 
of navigation (called neighbors) might communicate, they should also be close 
enough. It may happen that their own tasks do not give them the opportunity 
to meet (because their routes are deterministic and never cross) or it may take a 
long time if they navigate at random. A solution consists in deciding on a meet- 
ing point for each pair of neighbor robots. If two neighbors are close to a given 
meeting point at the same time, they have a rendezvous and can communicate. 

Although there exist many algorithms to broadcast messages, we only deal 
with algorithms working under a very weak assumption: each node or robot only 
knows its neighbors or its own meeting points. This implies that the underlying 
topology is unknown. Depending on the context, we might also be interested 
in anonymous networks in which the labeling of the nodes (or history of the 
visited nodes) is not used. By anonymous, we mean that unique identities are not 
available to distinguish nodes (processors) or edges (links). In a robot network, 
the network can have two (or more) meeting points with the same label if the 
environment contains two pairs of regions that do not overlap. The anonymous 
setting can be encountered in dynamic, mobile or heterogeneous networks. 



1.1 Related Works 

How can a message be broadcast efficiently with very poor knowledge of the 
topology of an anonymous network? Depending on the context, this problem is 
related to the way a “rumor” or an “epidemic” spreads in a graph. In the liter- 
ature, a node is contaminated if it knows the rumor. The broadcast algorithm 
highly depends on the communication model. For instance, in the k-ports model, 
a node can send a message to at most k neighbors. Thus our rendezvous model 
is a 1-port model. 

The performance of a broadcast algorithm is measured by the time required 
to contaminate all the nodes, the amount of memory stored at each node or 
the total number of messages. In this article, we analyze the time complexity 
in a synchronous setting of a rendezvous algorithm (although several broadcast 
algorithms including ours can work in an asynchronous setting, the theoretical 
time complexity is usually analyzed in a synchronous model). 

Many broadcast algorithms exist (see the survey by Hedetniemi et al. [10]) 
but few of them are related to our model. The closest model is the one of Feige et 
al. [8]. The authors prove general lower and upper bounds ($\log_2 n$ and $O(n \log n)$) 
on the time to broadcast a message with high probability in any unknown graph. 
A contaminated node chooses a neighbor uniformly at random but no rendezvous 
are needed. In our model, the time complexity increases since a rendezvous has 
to be obtained to communicate. For a family of small-world graphs and other 
models (2-ports model but a node can only transmit a given message a bounded 
number of times), Comellas et al. [6] showed that a broadcast can always be 
done. A recent work of Karp et al. [11] deals with the random phone call model. 
In each round, each node $u$ chooses another node $v$ uniformly at random (more 
or less as in [8]) but the transmission of a rumor is done either from the caller 
to the called node (push transmission algorithm) or from the called node to the 
caller (pull transmission algorithm). The underlying topology is the complete 
graph and they prove that any rumor broadcast in $O(\ln n)$ rounds requires 
sending $\omega(n)$ messages in expectation. 

However, the results for the random phone call model [8,11] do not imply the 
presented results in the rendezvous model: 

— The classes of graphs for which the broadcast runs fast or slowly are different 
in the rendezvous model and in the random phone call model. For instance, 
the lower bound is $\Omega(\log n)$ for the two models but for the complete graph, 
the broadcast time of $O(\log n)$ is close to the lower bound in the random 
phone call model whereas it becomes $O(n \log n)$ in the rendezvous model. 

— We deal with the expected broadcast time. Depending on the topology, this 
time can be either equal or different to the broadcast time with high 
probability¹. 

In the radio network setting (n-ports model), some algorithms and bounds 
exist whether the topology is known or unknown (see the survey of Chlebus [5]). 
However, the model of communication is different from ours: simultaneously, a 
node can send a message to all of its neighbors and a node can receive a message 
if and only if a unique neighbor sends a message. Two kinds of algorithms are 
proposed in the radio model : with or without collision detection. In our model, 
there is no problem of collision. 

Rendezvous in a broadcast protocol are used in applications like Dynamic 
Host Configuration Protocol but to the best of our knowledge, the analysis 
of a randomized rendezvous algorithm to broadcast in a network is new. The 
random rendezvous model was introduced in [13] in which the authors com- 
pute the expected number of rendezvous per round in a randomized algorithm. 
Their algorithm is a solution to implement synchronous message passing in an 
anonymous network that passes messages asynchronously [17]. Many concur- 
rent programming languages including CSP and Ada use this method to define 

¹ High probability means with probability $1 - O(n^{-c})$ for some positive constant $c$. 
a communication between pairs of asynchronous processes. Angluin [2] proved 
that there is no deterministic algorithm for this problem (see the paper of Lynch 
[12] containing many problems having no deterministic solutions in distributed 
computing). In [14], the rendezvous are used to elect randomly a leader in an 
anonymous graph. 



1.2 The Model 

Let $G = (V, E)$ be a connected and undirected graph of $n$ vertices and $m$ edges. 
For convenience and with respect to the problem of spreading an epidemic, a 
vertex is contaminated if it has received the message sent by an initial vertex $v_0$. 
The model can be implemented in a fully distributed way. The complexity 
analysis, however, is based on the concept of rounds, as is common in similar 
studies [8,13,14]. In our article, a round is the following sequence: 

— for each $v \in V$, choose uniformly at random an incident edge. 

— if an edge $(v_i, v_j)$ has been chosen by $v_i$ and $v_j$, there is a rendezvous. 

— if there is a rendezvous and if only $v_i$ is contaminated, then $v_j$ becomes 
contaminated. 

$T_G$ is the broadcast time or contamination time, that is, the number of rounds 
until all vertices of graph $G$ are contaminated. $T_G$ is an integer-valued random 
variable; in this paper, we concentrate the study on its expectation $E(T_G)$. 
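
The round described above can be simulated directly. The following Python sketch is only an illustration under our own conventions: the graph is given as a dictionary mapping each vertex to the list of its neighbors (connected, no isolated vertex), and the function names are hypothetical.

```python
import random


def rendezvous_round(adj, contaminated, rng):
    """Run one round and return the set of contaminated vertices after it."""
    # Step 1: every vertex picks one of its neighbors uniformly at random.
    choice = {v: rng.choice(adj[v]) for v in adj}
    newly = set()
    for u, v in choice.items():
        # Step 2: a rendezvous occurs on edge (u, v) if both endpoints chose each other.
        # Step 3: the message crosses the edge if exactly u already holds it.
        if choice[v] == u and u in contaminated and v not in contaminated:
            newly.add(v)
    return contaminated | newly


def broadcast_time(adj, v0, rng):
    """Number of rounds until every vertex of a connected graph is contaminated."""
    contaminated = {v0}
    rounds = 0
    while len(contaminated) < len(adj):
        contaminated = rendezvous_round(adj, contaminated, rng)
        rounds += 1
    return rounds
```

Averaging `broadcast_time` over many runs with, say, `rng = random.Random(0)` gives an empirical estimate of $E(T_G)$ for small graphs; on a single edge the broadcast always takes exactly one round.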

Some remarks can be made on our model. As explained in the introduction, 
the rendezvous process (the first two steps of the round) keeps repeating forever 
and should be seen as a way of maintaining connectivity. Several broadcasts can 
take place simultaneously and we do not give any priority to any one of them, 
even if we study a broadcast starting from a given vertex $v_0$. 

We concentrate our effort on $E(T_G)$ and we do not require that the algorithm 
finds out when the rumor sent by $v_0$ has reached all the nodes. However some 
hints can be given: we can stop the broadcast algorithm (do not run the third 
step of the round) using a local control mechanism in each node of the network: 
if identities of the nodes are available (non anonymous networks), each node 
keeps into its memory a list of contaminated neighbors for each rumor and when 
this list contains all the neighbors, the process may stop trying to contaminate 
them (with the same rumor). If the network is anonymous and the number of 
nodes n is known, then it is possible to prove that in 0(n^ log n) rounds with 
high probability, all the neighbors of a contaminated node know the rumor. 

In our algorithm, nodes of large degree and a large diameter increase the 
contamination time. Taking two adjacent nodes $v_i$ and $v_j$ of degrees $d_i$ and $d_j$ 
respectively, the expected number of rounds to contaminate $v_j$ from $v_i$ is $d_i d_j$, 
since the edge obtains a rendezvous with probability $1/(d_i d_j)$ in each round. 
For instance, take two stars of $n/2$ leaves each and join their centers by an edge. In the 
rendezvous model, the expected broadcast time is $O(n^2)$ whereas in the model of [8], 
it will be $\Theta(n \log n)$ in expectation and with high probability. Starting from this 
example, $E(T_G)$ can easily be upper bounded by $O(n^3)$ but we find a tighter 
upper bound. 

1.3 Our Results 

The main result of the paper is to prove in Section 2 that for any graph $G$, 
$\log_2 n \le E(T_G) \le O(n^2)$. More precisely, for any graph $G$ of maximal degree $\Delta$, 
$E(T_G) = O(\Delta n)$. 

In Section 3, we show that there are some graphs for which the expected 
broadcast time asymptotically matches either the lower bound or the upper 
bound up to a constant factor. For instance, for the complete balanced binary 
tree, $E(T_G) = O(\log_2 n)$ whereas $E(T_G) = \Omega(n^2)$ for the double star graph (two 
identical stars joined by one edge). For graphs of bounded degree $\Delta$ and diameter 
$D$, we also prove in Section 3 that $E(T_G) = O(D \Delta^2 \ln \Delta)$. This upper bound is 
tight since for $\Delta$-ary complete trees of diameter $D$, $E(T_G) = \Omega(D \Delta^2 \ln \Delta)$. The 
complete graph was proved [13] to have the least expected number of rendezvous 
per round; nevertheless, its expected broadcast time is $O(n \ln n)$. Due to space 
limitations, proofs of lemmas and corollaries are not given. 



2 Arbitrary Graphs 

The first section presents some terminology and basic lemmas that are useful for 
the main results. 



2.1 Generalities on the Broadcast Process 

The rendezvous process induces a broadcast process, that is, for each nonnegative 
integer $t$, we get a (random) set of vertices, $V_t$, which is the set of vertices 
that have been reached by the broadcast after $t$ rounds of rendezvous. The 
sequence $(V_t)_{t \in \mathbb{N}}$ is a homogeneous, increasing Markov process with state space 
$\{U : \emptyset \subsetneq U \subseteq V\}$. Any state $U$ contains the initial vertex $v_0$ and the subgraph 
induced by $U$ is connected. State $V$ is its sole absorbing state; thus, for each 
graph $G$, this process reaches state $V$ (that is, the broadcast is complete) in 
finite expected time. 

The transition probabilities for this Markov chain $(V_t)$ depend on the
rendezvous model. Specifically, if $U$ and $U'$ are two nonempty subsets of $V$,
the transition probability $p_{U,U'}$ is 0 if $U \not\subseteq U'$, and, if
$U \subseteq U'$, $p_{U,U'}$ is the probability that, on a given round of
rendezvous, $U' - U$ is the set of vertices not in $U$ that have a rendezvous
with a vertex in $U$. Thus, the loop probability $p_{U,U}$ is the probability
that each vertex in $U$ either has no rendezvous, or has one with another
vertex in $U$.
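
For very small graphs the chain above can be enumerated exactly. The following sketch assumes the rendezvous rule in which every vertex picks one neighbour uniformly at random and a rendezvous occurs on an edge when both endpoints pick each other; the adjacency-dictionary representation and the function names are illustrative, and the graph is assumed connected.

```python
from fractions import Fraction
from itertools import product

def transition_probabilities(adj, informed):
    """Distribution of the next informed set after one round of rendezvous."""
    vertices = sorted(adj)
    choices = [adj[v] for v in vertices]
    total = 1
    for c in choices:
        total *= len(c)
    dist = {}
    for picks in product(*choices):
        pick = dict(zip(vertices, picks))
        new = set(informed)
        for u in informed:
            v = pick[u]
            if pick[v] == u:  # u and v chose each other: rendezvous
                new.add(v)
        key = frozenset(new)
        dist[key] = dist.get(key, Fraction(0)) + Fraction(1, total)
    return dist

def expected_broadcast_time(adj, start):
    """Exact E(T_G) from `start`, using that the chain only moves to supersets."""
    full = frozenset(adj)
    memo = {full: Fraction(0)}

    def expect(state):
        if state not in memo:
            dist = transition_probabilities(adj, state)
            loop = dist.pop(state, Fraction(0))  # loop probability p_{U,U}
            rest = sum(p * expect(nxt) for nxt, p in dist.items())
            memo[state] = (1 + rest) / (1 - loop)
        return memo[state]

    return expect(frozenset([start]))
```

The enumeration is exponential in the number of vertices, so it is only meant as a check on toy instances such as a path or a small star.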

In the sequel, what we call the broadcast sequence is the sequence of distinct
states visited by the broadcast process between the initial state $\{v_0\}$ and
the final absorbing state $V$. A possible broadcast sequence is any sequence of
states that has a positive probability of being the broadcast sequence; this is
any sequence $X = (X_1, \ldots, X_m)$ such that $X_1 = V_0 = \{v_0\}$, $X_m = V$,
and $p_{X_k, X_{k+1}} > 0$ for all $k$.

By $d_u$ we denote the degree of vertex $u$. For a bounded degree graph, $\Delta$
is the maximal degree of the graph. By $D$ we denote the diameter of the graph.







P. Duchon et al. 



If $X_k = V_t$ is the set of the $k$ contaminated vertices at time $t$, then $Y_k$
is the set of remaining vertices. We define the cut $C_k$ as the set of edges
that have one endpoint in $X_k$ and the other in $Y_k$.

For any edge $a = (u,v) \in E$, $P(a) = (d_u d_v)^{-1}$ (resp. $P(\bar{a})$) is the
probability that edge $a$ will obtain (resp. not obtain) a rendezvous at a given
round. The product $(d_u d_v)^{-1}$ is also called the weight of the edge $a$.

We also define two values for any set of edges $C \subseteq E$: $P(\mathcal{E}_C)$
(resp. $P(\bar{\mathcal{E}}_C)$), where $\mathcal{E}_C$ is the event of obtaining a
rendezvous in a round for at least one edge (resp. no edge) in $C$; and
$\pi(C) = \sum_{a \in C} P(a)$. While $\pi(C)$ has no direct probabilistic
interpretation, it is much easier to deal with in computations. Obviously,
$P(\mathcal{E}_C) \le \pi(C)$ holds for any $C$. Lemma 2 provides us with a lower
bound for $P(\mathcal{E}_C)$ of the form $\Omega(\pi(C))$, provided $\pi(C)$ is not
too large.

With these notations, for any set of vertices $U$,
$p_{U,U} = 1 - P(\mathcal{E}_{C_U})$, where $C_U$ is the set of edges that have
exactly one endpoint in $U$ (the cut defined by the partition $(U, V - U)$).

Lemma 1. Let a € E and for any C C E, P(a | £c) > P(a). 



Lemma 2. For any $C \subseteq E$, $P(\mathcal{E}_C) \ge \lambda \min(1, \pi(C))$ with
$\lambda = 1 - e^{-1}$, where $e = \exp(1)$.
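
The two bounds on $P(\mathcal{E}_C)$ (the union bound $\pi(C)$ from above and $\lambda \min(1, \pi(C))$ from below) can be checked by brute force on a toy graph. This is only a sketch: it assumes the rendezvous rule in which each vertex picks a uniform random neighbour, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import exp

def pi_weight(adj, cut):
    """pi(C): sum of the edge weights 1 / (d_u * d_v) over the edges of C."""
    return sum(Fraction(1, len(adj[u]) * len(adj[v])) for u, v in cut)

def prob_some_rendezvous(adj, cut):
    """P(E_C): exact probability that at least one edge of C gets a rendezvous,
    obtained by enumerating every combination of neighbour choices."""
    vertices = sorted(adj)
    hits = total = 0
    for picks in product(*(adj[v] for v in vertices)):
        pick = dict(zip(vertices, picks))
        total += 1
        if any(pick[u] == v and pick[v] == u for u, v in cut):
            hits += 1
    return Fraction(hits, total)

def lemma2_lower_bound(adj, cut):
    """lambda * min(1, pi(C)) with lambda = 1 - 1/e (a float)."""
    return (1 - exp(-1)) * min(1, pi_weight(adj, cut))
```

On the 4-cycle with $C$ equal to the whole edge set, $\pi(C) = 1$ while $P(\mathcal{E}_C) = 7/8$, which sits between the two bounds.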



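Corollary 1. For any $C \subseteq E$,
$\frac{1}{P(\mathcal{E}_C)} \le \frac{e}{e-1}\left(1 + \frac{1}{\pi(C)}\right)$.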



Lemma 3. For any graph $G$, any integer $k$ and any $p \in (0,1)$, if
$P(T_G > k) \le p$, then $E(T_G) \le k/(1-p)$.

Since the number of contaminated vertices can be at most doubled at each
round, we have the following trivial lower bound:

Theorem 1. For any graph $G$, $T_G \ge \log_2 n$ with probability 1.



2.2 The General Upper Bound 

We will prove the following : 

Theorem 2. For any connected graph $G$ with $n$ vertices and maximum degree
$\Delta$, the broadcast time $T_G$ satisfies

$$E(T_G) \le \frac{e}{e-1}(n-1)(6\Delta+1). \qquad (2)$$
The proof of this theorem is a bit involved; we will sketch it before stating
and proving a few lemmas.

The probability distribution for the full broadcast time $T_G$ is not known, but,
when conditioned by the sequence of states visited by the broadcast process, it
becomes a sum of independent geometric random variables, for which the
parameters are known exactly (Lemma 4). Thus, the conditional expectation of the
broadcast time becomes the weight of some trajectory, which is defined as a sum
of weights for the visited states. Each individual weight is upper bounded by an
expression that only depends on individual rendezvous probabilities (Lemma 2
and Corollary 1), and then a uniform upper bound is obtained for the
conditional expectations (Lemma 5); this uniform upper bound then
straightforwardly translates into an upper bound for the (unconditional)
expected broadcast time.

The next lemma is stated in a more general setting than our broadcasting 
process. 

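Lemma 4. Let $(M_t)_{t \in \mathbb{N}}$ be a homogeneous Markov chain with finite
state space $S$ and transition probabilities $(p_{x,y})_{x,y \in S}$.

Let $(T_k)_{k \in \mathbb{N}}$ denote the increasing sequence of stopping times
defined by

$$T_0 = 0, \qquad T_{k+1} = \inf\{t > T_k : M_t \ne M_{T_k}\},$$

and let $(M'_k)_{k \in \mathbb{N}}$ be the "trajectory" chain defined by

$$M'_k = \begin{cases} M_{T_k} & \text{if } T_k < \infty, \\ M'_{k-1} & \text{if } T_k = \infty. \end{cases}$$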

Then, for any sequence $x_0, \ldots, x_N$ such that $x_k \ne x_{k+1}$ and
$p_{x_k, x_{k+1}} > 0$ for $0 \le k \le N-1$, conditioned on $M'_k = x_k$ for
$0 \le k \le N$, $T = (T_{k+1} - T_k)_{0 \le k \le N-1}$ is distributed as a vector
of independent geometric random variables with respective parameters
$1 - p_{x_k, x_k}$.

Corollary 2. Let $V'$ denote the trajectory of the loopless broadcast process
(written $M'$ in the statement of Lemma 4).

Let $X = (X_1, \ldots, X_m)$ be any possible broadcast sequence, and
$C = (C_1, \ldots, C_{m-1})$ the corresponding sequence of cuts. Then

$$E(T_G \mid V' = X) = \sum_{k=1}^{m-1} \frac{1}{P(\mathcal{E}_{C_k})}.$$

Lemma 5. Define the weight of any possible broadcast sequence $X$ as

$$w(X) = \sum_{k=1}^{m-1} \frac{1}{\pi(C_k)}. \qquad (3)$$

Then

$$w(X) \le 6(n-1)\Delta, \qquad (4)$$

where $\Delta$ is the maximum degree of $G$.
Proof. We begin by noting that, since we are looking for a uniform upper bound
on the weight, we can assume that $m = n$, which is equivalent to $|X_k| = k$ for
all $k$. If such is not the case in a sequence $X$, then we can obtain another
possible sequence with a higher weight by inserting an additional set $X'$
between any two consecutive sets $X_k$ and $X_{k+1}$ such that
$|X_{k+1} - X_k| \ge 2$ (with the condition that $p_{X_k,X'}$ and $p_{X',X_{k+1}}$
are both positive; such an $X'$ always exists, because each edge of every graph
has positive probability of being the only rendezvous edge in a given round).
This will just add a positive term to the weight of the sequence; thus, the
sequence with the maximum weight satisfies $m = n$.

To prove that $\sum_{k=1}^{n-1} 1/\pi(C_k) \le 6(n-1)\Delta$, we prove that the
integer interval $[1, n-1]$ can be partitioned into a sequence of smaller
intervals, such that, on each interval, the average value of $1/\pi(C_k)$ is at
most $6\Delta$.

Assume that integers 1 to $k-1$ have been thus partitioned, and let us consider
$C_k$. If $\pi(C_k) \ge 1/(4\Delta)$ (that is, $1/\pi(C_k) \le 4\Delta < 6\Delta$),
we put $k$ into an interval by itself and move on to $k+1$. We now assume
$\pi(C_k) < 1/(4\Delta)$, and set $1/\pi(C_k) = \alpha\Delta$ with $\alpha > 4$.

Let $v$ be the next vertex to be reached by the broadcast after $X_k$, that is,
$\{v\} = X_{k+1} - X_k$. This vertex must have at least one neighbor $u$ in $X_k$.

Let $d \ge 1$ denote the number of neighbors of $v$ that are in $X_k$. Each edge
incident to $v$ has weight at least $1/(d_v\Delta)$, and $d$ of them are in $C_k$,
so that we have $d/(d_v\Delta) \le \pi(C_k) = 1/(\alpha\Delta)$, or equivalently,

$$d \le d_v/\alpha. \qquad (5)$$



Thus, $v \in X_{k+1}$ has $d_v - d$ neighbors in $Y_{k+1} = V - X_{k+1}$. Since at
most one of them is added to $X$ at each step of the sequence, this means that,
for $0 \le j \le d_v - d$, $Y_{k+1+j}$ contains at least $d_v - d - j$ neighbors
of $v$. In other words, $C_{k+1+j}$ contains at least $d_v - d - j$ edges that are
incident to $v$, each of which has weight at least $1/(d_v\Delta)$. Consequently,

$$\frac{1}{\pi(C_{k+1+j})} \le \frac{d_v\Delta}{d_v - d - j} \qquad (6)$$

holds for $0 \le j < d_v - d$.

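The right-hand side of (6) increases with $j$, and for $j = \lfloor d_v/4 \rfloor$
(recall eq. (5) and $\alpha > 4$), it is

$$\frac{d_v\Delta}{d_v - d - \lfloor d_v/4 \rfloor} \le \frac{d_v\Delta}{d_v - 2\lfloor d_v/4 \rfloor} \le \frac{d_v\Delta}{\lceil d_v/2 \rceil} \le 2\Delta.$$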



Summing (6) over $0 \le j \le \lfloor d_v/4 \rfloor$, we obtain

$$\sum_{j=0}^{\lfloor d_v/4 \rfloor} \frac{1}{\pi(C_{k+1+j})} \le 2\Delta\left(1 + \frac{d_v}{4}\right). \qquad (7)$$
Since $d_v \ge \alpha$, we also have $1/\pi(C_k) \le d_v\Delta$. Adding this to (7),
we now get

$$\frac{1}{\pi(C_k)} + \sum_{0 \le j \le \lfloor d_v/4 \rfloor} \frac{1}{\pi(C_{k+1+j})} \le \Delta\left(d_v + 2 + \frac{d_v}{2}\right) = \Delta\left(2 + \frac{3d_v}{2}\right).$$



There are $2 + \lfloor d_v/4 \rfloor \ge 1 + d_v/4$ terms in the left-hand side of
this inequality, so that the average value of $1/\pi(C_i)$, when $i$ ranges over
$[k, k+1+\lfloor d_v/4 \rfloor]$, is at most

$$\Delta\,\frac{2 + 3d_v/2}{1 + d_v/4} \le 6\Delta. \qquad (8)$$

This concludes the recursion, and the proof. 
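
To see the quantities of Lemma 5 on a concrete instance, the weight of a given broadcast sequence can be computed directly from the cuts. The sketch below uses an adjacency dictionary and hypothetical helper names; it evaluates $w(X)$ for one sequence and does not search for the maximum.

```python
from fractions import Fraction

def cut_edges(adj, informed):
    """Edges with exactly one endpoint in `informed` (the cut C_k)."""
    return [(u, v) for u in informed for v in adj[u] if v not in informed]

def sequence_weight(adj, sequence):
    """w(X): sum over all states but the last of 1 / pi(C_k)."""
    weight = Fraction(0)
    for state in sequence[:-1]:
        pi = sum(Fraction(1, len(adj[u]) * len(adj[v]))
                 for u, v in cut_edges(adj, state))
        weight += 1 / pi
    return weight

def lemma5_bound(adj):
    """The uniform bound 6 (n - 1) Delta of Lemma 5."""
    n = len(adj)
    max_degree = max(len(neigh) for neigh in adj.values())
    return 6 * (n - 1) * max_degree
```

On the path with three vertices and the sequence $\{0\}, \{0,1\}, \{0,1,2\}$, each cut consists of one edge of weight $1/2$, so $w(X) = 4$, well below the bound $6 \cdot 2 \cdot 2 = 24$.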



Proof (Theorem 2).

Let $X$ be any possible broadcast sequence as in Lemma 5. Applying Corollary 1
to $C = C_k$ and summing over $k$, we get

$$\sum_{k=1}^{m-1} \frac{1}{P(\mathcal{E}_{C_k})} \le \frac{e}{e-1} \sum_{k=1}^{m-1} \left(1 + \frac{1}{\pi(C_k)}\right). \qquad (9)$$

By Lemma 5, the right-hand side of (9) is at most

$$\frac{e}{e-1}\bigl(n - 1 + 6\Delta(n-1)\bigr) = \frac{e(n-1)(6\Delta+1)}{e-1}. \qquad (10)$$

By Lemma 4, the left-hand side of (9) is the conditional expectation of $T_G$.
The upper bound remains valid upon taking a convex linear combination, so
that we get, as claimed,

$$E(T_G) \le \frac{e(n-1)(6\Delta+1)}{e-1}. \qquad (11)$$

Note: It should be clear that the constants are not best possible, even with our 
method of proof. They are, however, quite sufficient for our purpose, which is to 
obtain a uniform bound on the expected broadcast time. 



3 Specific Graphs 

Theorems 1 and 2 provide lower and upper bounds on the expected contamination
time for any graph. In this section, we prove that there exist graphs
for which the bounds can be attained.

The well-known coupon-collector problem (that is, the number of trials required
to obtain $n$ different coupons if, in each round, one is chosen randomly and
independently; see [15] for instance) implies the next lemma:

Lemma 6. For a star $S$ of $n$ leaves, $E(T_S) = n \ln n + O(n)$.
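
The coupon-collector value behind Lemma 6 is easy to evaluate exactly. The sketch assumes the broadcast starts at the center of the star: every leaf has degree 1 and always picks the center, so each round the center has a rendezvous with one leaf chosen uniformly at random. The function name is illustrative.

```python
from fractions import Fraction
from math import log

def star_expected_broadcast_from_center(n: int) -> Fraction:
    """Coupon collector with n coupons: n * H_n = sum over k of n / k."""
    return sum(Fraction(n, k) for k in range(1, n + 1))

def gap_to_n_ln_n(n: int) -> float:
    """Difference between the exact value and n ln n, which stays O(n)."""
    return float(star_expected_broadcast_from_center(n)) - n * log(n)
```

For $n = 1000$ the gap to $n \ln n$ is a few hundred rounds, in line with the $O(n)$ term.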
3.1 The l-Star Graphs

An $l$-star graph $S_l$ is a graph built from a chain of $l + 2$ vertices; then, to
each vertex other than the two extremities, $\Delta - 2$ leaves are added. Let
$S_l$ be an $l$-star graph with $n = l(\Delta - 1) + 2$ vertices. According to
Theorem 2, $E(T_{S_l}) = O(\Delta n) = O(n^2/l)$. On the other hand, the expected
number of rounds to get a rendezvous between the centers of two adjacent stars
is $\Delta^2$ and, therefore, the expected number of rounds for contaminating all
the centers is $\Omega(l\Delta^2) = \Omega(n\Delta)$.
As a corollary to this result we have:

Proposition 1. There exists an infinite family of graphs $\mathcal{F}$ of $n$
vertices and maximal degree $\Delta$ such that, for any $G \in \mathcal{F}$,
$E(T_G) = \Omega(\Delta n)$.

It follows that the general upper bound $O(n^2)$ given by Theorem 2 is tight for
any $l$-star graph with $l \ge 2$ constant.

3.2 Matching the Lower Bound 

To prove that the $\Omega(\log n)$ bound is tight, we prove an upper bound that only
involves the maximum degree $\Delta$ and the diameter $D$.

Theorem 3. Let $G$ be any graph with maximum degree $\Delta \ge 3$ and diameter
$D$. Then the expected broadcast time in $G$, starting from any vertex, is at
most $4\Delta^2(\ln 2 + D + D \ln \Delta)$.

Our proof of this theorem will make use of the following lemma. 

Lemma 7. Fix a constant $p > 0$, and let $S_k$ denote the sum of $k$ independent
geometric random variables with parameter $p$.

Then, for any $t \ge k/p$, we have

$$P(S_k > t) \le \exp\left(-\frac{pt}{2}\left(1 - \frac{k}{pt}\right)^2\right).$$

Proof (Theorem 3). We prove that the probability for the broadcast time to
exceed half of the claimed bound is at most 1/2 and then use Lemma 3.

Let $u$ be the initial vertex for the broadcast. For each other vertex $v$, pick
a path $\gamma_{uv}$ from $u$ to $v$ with length at most $D$. Since all degrees are
at most $\Delta$, each edge in $\gamma_{uv}$ has a rendezvous probability at least
$1/\Delta^2$. Hence, the broadcast time from $u$ to $v$ along the path (that is,
the time until the first edge has a rendezvous, then the second edge, and so
on) is distributed as the sum of independent geometric random variables with
parameters equal to the rendezvous probabilities, and is thus stochastically
dominated by the sum of $D$ independent geometric random variables with
parameter $1/\Delta^2$.

Let $T_{u,v}$ denote the time until broadcast reaches $v$ when the initial vertex
is $u$; Lemma 7 and the above discussion imply, for any $t$,

$$P(T_{u,v} > t) \le \exp\left(-\frac{t}{2\Delta^2}\left(1 - \frac{D\Delta^2}{t}\right)^2\right). \qquad (12)$$
Let $1 + n$ denote the number of vertices in $G$. Moore's bound ensures that
$n \le \Delta^D$.

It is routine to check that, if $t \ge 2\Delta^2(\ln 2 + D + D \ln \Delta)$, then
$t(1 - D\Delta^2/t)^2 \ge 2\Delta^2(\ln 2 + D \ln \Delta) \ge 2\Delta^2 \ln(2n)$.
Thus, for each of the $n$ vertices $v \ne u$, we get

$$P(T_{u,v} > t) \le e^{-\ln(2n)} = \frac{1}{2n}, \qquad (13)$$

so that, summing over $v$, we get

$$P(T_u > t) \le \frac{1}{2}. \qquad (14)$$
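
The "routine" inequality used in this proof can be checked numerically. The sketch below takes the threshold $t = 2\Delta^2(\ln 2 + D + D\ln\Delta)$ (half of the bound of Theorem 3) and evaluates the exponent $t(1 - D\Delta^2/t)^2/(2\Delta^2)$ as read from the text; the function names are illustrative, and the check assumes $n \le \Delta^D$.

```python
from math import log

def half_bound(delta: int, diameter: int) -> float:
    """Half of the bound claimed in Theorem 3."""
    return 2 * delta ** 2 * (log(2) + diameter + diameter * log(delta))

def tail_exponent(t: float, delta: int, diameter: int) -> float:
    """Exponent t (1 - D Delta^2 / t)^2 / (2 Delta^2) of the tail bound."""
    return t * (1 - diameter * delta ** 2 / t) ** 2 / (2 * delta ** 2)

def exponent_covers_union_bound(delta: int, diameter: int) -> bool:
    """True when the exponent at t = half_bound is at least ln(2 n)
    for the largest admissible n = delta ** diameter."""
    t = half_bound(delta, diameter)
    return tail_exponent(t, delta, diameter) >= log(2 * delta ** diameter)
```

Expanding the square gives $t - 2D\Delta^2 + D^2\Delta^4/t$, so the exponent exceeds $\ln 2 + D \ln \Delta$ by the positive term $D^2\Delta^2/(2t)$.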

Corollary 3. There exists an infinite family of graphs $\mathcal{F}$ such that,
for any $G \in \mathcal{F}$, $E(T_G) = O(\log |G|)$.

3.3 The Complete Graph 

It also seems interesting to point out that the complete graph has the
minimal (see [13]) expected rendezvous number in a round:

$$\frac{n}{2(n-1)},$$

which is asymptotically $1/2$. We prove in this section that its expected
broadcast time is however $O(n \ln n)$, which is significantly shorter than that
of the $l$-star graph with $l$ constant, which is $\Omega(n^2)$.

Lemma 8. $E(T_{K_n}) \le 2\lambda^{-1} n \ln n + O(n)$.

Moreover, we have: 

Lemma 9. With probability 1 — E(T^„) > \nlnn.

Lemmas 9 and 8 imply: 

Proposition 2. $E(T_{K_n}) = \Theta(n \ln n)$.

3.4 Graphs of Bounded Degree and Bounded Diameter 

Lemma 10. Let $G$ be a $\Delta$-regular balanced complete rooted tree of depth 2.
The expected time for the root to contaminate its children is
$\Theta(\Delta^2 \ln \Delta)$.

Theorem 4. Let $G$ be a $\Delta$-regular balanced complete rooted tree of depth
$D/2$ with $D$ even. Then $E(T_G) = \Omega(D\Delta^2 \ln \Delta)$.

Proof. Suppose the broadcast starts from the root $v_0$. Let us construct a path
$v_0, v_1, v_2, \ldots, v_{D/2}$ such that $v_i$ is the last contaminated child of
$v_{i-1}$. $T_{v_i}$ denotes the number of rounds to contaminate $v_i$ by its
parent $v_{i-1}$ once $v_{i-1}$ is contaminated. Since $T_G \ge \sum_i T_{v_i}$ and,
from Lemma 10, for every $1 \le i \le D/2$,
$E(T_{v_i}) = \Theta(\Delta^2 \ln \Delta)$, we have
$E(T_G) \ge \sum_{i=1}^{D/2} E(T_{v_i}) = \Omega(D\Delta^2 \ln \Delta)$.

Theorem 4 proves that there exists a graph for which the upper bound of 
Theorem 3 is tight. 
Abstract. Nisan [6] showed that any randomized logarithmic space
algorithm (running in polynomial time and with two-sided error) can
be simulated by a deterministic algorithm that runs simultaneously in
polynomial time and $O(\log^2 n)$ space. Subsequently Saks and Zhou [9]
improved the space complexity and showed that a deterministic simulation
can be carried out in space $O(\log^{1.5} n)$. However, their simulation
runs in time $n^{O(\log^{0.5} n)}$. We prove a time-space tradeoff that
interpolates these two simulations. Specifically, we prove that, for any
$0 \le \alpha \le 0.5$, any randomized logarithmic space algorithm (running in
polynomial time and with two-sided error) can be simulated deterministically
in time $n^{O(\log^{0.5-\alpha} n)}$ and space $O(\log^{1.5+\alpha} n)$. That is, we
prove that $BPL \subseteq DTISP[n^{O(\log^{0.5-\alpha} n)}, O(\log^{1.5+\alpha} n)]$.

1 Introduction 

Given an undirected graph and vertices $s$ and $t$, the undirected st-connectivity
problem is to check whether there is a path from $s$ to $t$. It is one of the most
fundamental problems in computer science and has been well-studied for the
past three decades. Procedures like depth-first search and breadth-first search
solve the problem in polynomial time, but use linear space. On the other hand,
Savitch's theorem [10] gives an algorithm (even for directed graphs) that uses
only $O(\log^2 n)$ space. However, the algorithm runs in time $n^{O(\log n)}$.

It remains open whether undirected st-connectivity can be solved in logarith- 
mic space. It is known that this can be accomplished if randomness is allowed. 
Aleliunas et al. [1] showed that the problem can be solved in randomized loga- 
rithmic space (with one-sided error). Most subsequent improvements have been 
via deterministic simulation of the randomized logspace algorithm. 

In a major breakthrough, Nisan [5] constructed a pseudorandom generator
that stretches a seed of length $O(\log^2 n)$ into a string of polynomial length.
He showed that the output of the generator (on a randomly chosen seed) looks
almost truly random for any randomized logspace computation (with two-sided
* A more detailed version of the paper is available at 
http://www.cs.wisc.edu/~dieter. Research supported in part by NSF grants 
CCR-9634665, CCR-0208013 and CCR-0133693. 
error). Then, if we run through all possible seeds, we get a deterministic
simulation. This naive simulation using Nisan's generator needs $O(\log^2 n)$
space and $n^{O(\log n)}$ time, and hence fails to beat the bounds from Savitch's
theorem. However, Nisan [6] showed that, in fact, one can perform the simulation
in polynomial time, while using only $O(\log^2 n)$ space. This settled an
important problem of whether randomized logspace algorithms can be simulated in
simultaneous polynomial time and polylog space. Subsequently, Nisan, Szemeredi
and Wigderson [7] presented an $O(\log^{1.5} n)$ space algorithm for undirected
st-connectivity. In a sweeping generalization, Saks and Zhou [9] proved that any
randomized logspace algorithm can be simulated deterministically in
$O(\log^{1.5} n)$ space. But the time complexity of this simulation is
$n^{O(\log^{0.5} n)}$. In this paper, we generalize the deterministic simulations
of Nisan [6], and of Saks and Zhou [9], and exhibit a time-space tradeoff
interpolating the two.

We denote by BPL the class of all languages that can be recognized by a 
randomized algorithm with two-sided error that uses logarithmic space and runs 
in polynomial time (for any input and for any random coin tosses). It contains 
RL, the one-sided error class. We refer to the excellent survey on probabilistic 
space bounded complexity classes by Saks [8] for more background. For a time
bound $t(n)$ and space bound $s(n)$, let $DTISP[t, s]$ denote the class of
languages that can be recognized by a deterministic algorithm that runs in time
$t(n)$ and uses space $s(n)$. With this notation, we can restate the previous
results as:

- $BPL \subseteq DTISP[n^{O(1)}, O(\log^2 n)]$ (Nisan [6]).

- $BPL \subseteq DTISP[n^{O(\log^{0.5} n)}, O(\log^{1.5} n)]$ (Saks and Zhou [9]).

In this paper, we establish a time-space tradeoff that generalizes the above two
results:

Theorem 1. $BPL \subseteq DTISP[n^{O(\log^{0.5-\alpha} n)}, O(\log^{1.5+\alpha} n)]$,
for any rational number $0 \le \alpha \le 0.5$.

2 Preliminaries 

As is standard in sublinear space bounded computation, we use an off-line Turing 
machine model to measure space complexity. Thus, the input is considered given 
on a separate read-only tape and does not count towards the space complexity. 
Randomness is provided by a one-way read-only tape containing unbiased and 
uncorrelated random bits. The random bit tape does not count towards the space 
complexity either. 

As is common in prior works, we model the computation of a randomized 
logspace polynomial-time Turing machine on an input w as a finite state automa- 
ton, and analyze the automaton’s behavior using its transition matrix. This way, 
the issue of deterministic simulation reduces to approximate matrix powering. 
The following definitions and facts will be used for that purpose. 

Definition 1. For positive integers $d$ and $m$, a $(d,m)$-automaton $Q$ is a
mapping $Q : \{0,1,\ldots,d\} \times \{0,1\}^m \to \{0,1,\ldots,d\}$, such that, for
all $\alpha \in \{0,1\}^m$, $Q(0,\alpha) = 0$. We say that $Q$ has $d+1$ states
numbered $0, 1, 2, \ldots, d$. State 0 is a special state called the dead state of
$Q$. We define a mapping
$\hat{Q} : \{0,1,\ldots,d\} \times (\{0,1\}^m)^* \to \{0,1,\ldots,d\}$ recursively as
follows. For $i \in \{0,1,\ldots,d\}$, we define $\hat{Q}(i,\epsilon) = i$, where
$\epsilon$ denotes the empty string. Let $i \in \{0,1,\ldots,d\}$ and
$\alpha_1\alpha_2\ldots\alpha_t \in (\{0,1\}^m)^+$, where each
$\alpha_j \in \{0,1\}^m$ and $t$ is a positive integer. Then
$\hat{Q}(i, \alpha_1\alpha_2\ldots\alpha_t) = Q(\hat{Q}(i, \alpha_1\alpha_2\ldots\alpha_{t-1}), \alpha_t)$.



Definition 2. For a $d \times d$ matrix $M$ over the real numbers, define the norm
of $M$ to be

$$\|M\| = \max_{1 \le i \le d} \sum_{j=1}^{d} |M[i,j]|,$$

where $M[i,j]$ is the $(i,j)$ entry of $M$. The matrix $M$ is said to be
sub-stochastic if $\|M\| \le 1$ and $M[i,j] \ge 0$ for any
$i,j \in \{1,2,\ldots,d\}$, i.e., $M$ has non-negative entries with each row sum at
most 1. Such a matrix is said to be a $(d,m)$-matrix if for any
$i,j \in \{1,2,\ldots,d\}$, $2^m M[i,j]$ is an integer.



Definition 3. Let $Q$ be a $(d,m)$-automaton. The transition matrix of $Q$,
denoted by $\mathcal{M}(Q)$, is the $(d,m)$-matrix $M$ with, for
$i,j \in \{1,2,\ldots,d\}$, $M[i,j] = \Pr_{\alpha \in \{0,1\}^m}[Q(i,\alpha) = j]$.

We will exploit the following connection between powers of the transition matrix
$M = \mathcal{M}(Q)$ and "powers" of $Q$: For any nonnegative integer $k$,

$$M^k[i,j] = \Pr_{s \in (\{0,1\}^m)^k}[\hat{Q}(i,s) = j], \qquad (1)$$

i.e., the $(i,j)$ entry of $M^k$ equals the probability that a random string of
length $k$ over the alphabet $\{0,1\}^m$ "moves" $Q$ from $i$ to $j$.
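
Identity (1) can be verified by enumeration on a toy automaton. In the sketch below, symbols of $\{0,1\}^m$ are encoded as integers in $[0, 2^m)$, and the automaton `toy_automaton` is a made-up example with $d = 2$ and $m = 1$; none of these names come from the paper.

```python
from fractions import Fraction
from itertools import product

def run(Q, state, symbols):
    """Extended map: apply Q symbol by symbol to a string of symbols."""
    for a in symbols:
        state = Q(state, a)
    return state

def transition_matrix(Q, d, m):
    """M[i][j] = Pr over a uniform symbol a of Q(i, a) == j, for i, j in 1..d."""
    M = [[Fraction(0)] * d for _ in range(d)]
    for i in range(1, d + 1):
        for a in range(2 ** m):
            j = Q(i, a)
            if j != 0:  # moves to the dead state 0 are not recorded
                M[i - 1][j - 1] += Fraction(1, 2 ** m)
    return M

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(M, k):
    n = len(M)
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = mat_mul(P, M)
    return P

def walk_probability(Q, m, i, j, k):
    """Right-hand side of (1): fraction of length-k strings moving i to j."""
    hits = sum(1 for s in product(range(2 ** m), repeat=k) if run(Q, i, s) == j)
    return Fraction(hits, 2 ** (m * k))

def toy_automaton(i, a):
    """Hypothetical (2,1)-automaton; state 0 is the dead state."""
    table = {(1, 0): 1, (1, 1): 2, (2, 0): 0, (2, 1): 2}
    return table.get((i, a), 0)
```

Because state 0 is absorbing and excluded from the matrix, the rows of $M$ may sum to less than 1, which is why sub-stochastic matrices appear in Definition 2.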

Note that different $(d,m)$-automata $Q$ may have the same transition matrix
$\mathcal{M}(Q)$. We will need to invert the operator $\mathcal{M}$ in a canonical way.

Definition 4. Let $M$ be a $(d,m)$-matrix. The canonical automaton corresponding
to $M$, denoted by $\mathcal{Q}(M)$, is a $(d,m)$-automaton $Q$ defined as follows.
Let $i \in \{1,2,\ldots,d\}$ and $\alpha \in \{0,1\}^m$. Let $\bar{\alpha}$ denote the
number with binary representation $.\alpha$. If
$\bar{\alpha} \ge \sum_{j=1}^{d} M[i,j]$, then $Q(i,\alpha) = 0$; otherwise, let
$k \in \{1,2,\ldots,d\}$ be the least integer such that
$\bar{\alpha} < \sum_{j=1}^{k} M[i,j]$; then we let $Q(i,\alpha) = k$. An automaton
$Q$ is called canonical if it is the canonical automaton corresponding to some
$M$.

Note that $\mathcal{M}(\mathcal{Q}(M)) = M$ for any $(d,m)$-matrix $M$. By applying
$\mathcal{Q}$ once more, it follows that, for a canonical $Q$,
$\mathcal{Q}(\mathcal{M}(Q)) = Q$.
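
The round trip between matrices and canonical automata can be sketched as follows. Symbols are encoded as integers $a \in [0, 2^m)$, so the number with binary representation $.\alpha$ is $a / 2^m$; the function names are illustrative.

```python
from fractions import Fraction

def canonical_automaton(M, m):
    """Canonical (d,m)-automaton of a (d,m)-matrix M given as a list of rows."""
    d = len(M)

    def Q(i, a):
        if i == 0:
            return 0  # the dead state is absorbing
        alpha_bar = Fraction(a, 2 ** m)  # number with binary representation .a
        prefix = Fraction(0)
        for k in range(1, d + 1):
            prefix += M[i - 1][k - 1]
            if alpha_bar < prefix:
                return k  # least k whose prefix sum exceeds alpha_bar
        return 0  # alpha_bar is at least the full row sum: dead state

    return Q

def transition_matrix(Q, d, m):
    """M[i][j] = Pr over a uniform symbol a of Q(i, a) == j, for i, j in 1..d."""
    M = [[Fraction(0)] * d for _ in range(d)]
    for i in range(1, d + 1):
        for a in range(2 ** m):
            j = Q(i, a)
            if j != 0:
                M[i - 1][j - 1] += Fraction(1, 2 ** m)
    return M
```

Each row of $M$ is thus laid out as consecutive intervals of $[0,1)$, and the leftover interval, if any, is sent to the dead state.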
3 Overall Approach and Main Ingredients

Given a randomized logspace, polynomial-time, two-sided error Turing machine, 
and an input w of length n, we need to distinguish between the case where the 
acceptance probability is at least 2/3, and the case where it is at most 1/3. We do 
so by approximating the acceptance probability to within 1/6 in a deterministic 
way within the space and time bounds specified in the tradeoff. The problem 
reduces to matrix powering in the following standard fashion. 

Consider the configurations of the machine on input $w$, where each
configuration consists of the state of the machine, the positions of the tape
heads and the contents of the work tape. Let the total number of configurations
be $d$. We can label them $1, 2, \ldots, d$. As the machine runs in logspace and
polynomial time, $d = n^{O(1)}$. We can assume that the machine reads a fresh
random bit $b$ in every step of the computation. The machine then behaves like
a $(d,1)$-automaton. More formally, the function $Q$ that maps $(i,b)$ to the
state of the machine after reading random bit $b$ in state $i$, defines a
$(d,1)$-automaton. Note that $Q$ as well as its transition matrix
$M = \mathcal{M}(Q)$ can be computed in logspace given $w$. Without loss of
generality, the initial configuration of the machine is labeled 1, and the
machine has a unique accepting configuration labeled $d$, which is absorbing
(i.e., $M[d,d] = 1$). Suppose the machine runs for $p$ steps, where $p$ is
polynomial in $n$. By (1), the acceptance probability of the machine on input
$w$ is given by the $(1,d)$ entry of the matrix $M^p$, which is the same as the
$(1,d)$ entry of $M^{2^r}$ for any integer $r \ge \log p$. In Section 4, we sketch
a proof of the following approximation result.

Theorem 2. Given a $(d,1)$-matrix $M$, positive integers $r$, $r_1$ and $r_2$, where
$r = r_1 r_2 = O(\log d)$, and a positive integer $a = O(\log d)$, there is a
deterministic algorithm that computes a matrix $M'$ such that
$\|M^{2^r} - M'\| \le 2^{-a}$. The algorithm uses space
$O(\max\{r_1, r_2\} \log d)$ and runs in time $d^{O(\min\{r_1, r_2\})}$.

Setting $r_1 = \Theta(\log^{0.5+\alpha} d)$, $r_2 = \Theta(\log^{0.5-\alpha} d)$, and
$a = 3$ in Theorem 2 yields Theorem 1.

Hence, given a $(d,1)$-matrix $M$ and an integer $r = O(\log d)$, our goal is to
approximate $M^{2^r}$ within certain time and space bounds. Instead of working
with $(d,1)$-matrices, we consider the problem for $(d,m)$-matrices. Clearly, any
$(d,1)$-matrix is also a $(d,m)$-matrix for any $m \ge 1$. A specific value for
$m = O(\log d)$ will be fixed later.



3.1 Nisan’s Pseudorandom Generator Construction 

A standard way of computing $M^{2^r}$ (either exactly or approximately) is by
repeated squaring. A recursive implementation as in Savitch's Theorem yields a
procedure that runs in $\Theta(\log^2 d)$ space and $d^{\Theta(\log d)}$ time: Each
of the $O(\log d)$ levels of recursion induces an additive term of
$\Theta(\log d)$ to the space complexity, and a multiplicative term of
$d^{\Theta(1)}$ to the time complexity for storing and (re)computing intermediate
results.
Nisan's breakthrough [5,6] can be viewed as repeated approximate squaring
in a way that avoids the need for storing and recomputing intermediate results.
The basic ingredient is the following procedure to approximately "square" an
underlying automaton using a hash function $h$.

Definition 5. Let $Q$ be a $(d,m)$-automaton, $h : \{0,1\}^m \to \{0,1\}^m$, and
$\epsilon > 0$. Define $Q_h$, also denoted by $\mathcal{N}_h(Q)$, to be the following
$(d,m)$-automaton: For any $i \in \{0,1,\ldots,d\}$ and $\alpha \in \{0,1\}^m$,
$Q_h(i,\alpha) = \hat{Q}(i, \alpha \circ h(\alpha))$, where $\circ$ denotes
concatenation. We say that $h$ is $\epsilon$-pseudorandom for $Q$ if
$\|\mathcal{M}(Q_h) - \mathcal{M}(Q)^2\| \le \epsilon$.

By (1), $\mathcal{M}(Q)^2$ corresponds to picking $\alpha, \beta \in \{0,1\}^m$
independently at random and running $Q$ on $\alpha \circ \beta$. Similarly,
$\mathcal{M}(Q_h)$ corresponds to picking only $\alpha \in \{0,1\}^m$ uniformly at
random, and running $Q$ on $\alpha \circ h(\alpha)$. This motivates the use of the
term "pseudorandom."
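
The error $\|\mathcal{M}(Q_h) - \mathcal{M}(Q)^2\|$ of Definition 5 can be computed exactly for small parameters. In the sketch below, symbols are integers in $[0, 2^m)$; both the automaton `toy_automaton` and the map `shift_hash` are made-up examples (in particular, `shift_hash` is not claimed to come from a universal family).

```python
from fractions import Fraction

def transition_matrix(Q, d, m):
    """M[i][j] = Pr over a uniform symbol a of Q(i, a) == j, for i, j in 1..d."""
    M = [[Fraction(0)] * d for _ in range(d)]
    for i in range(1, d + 1):
        for a in range(2 ** m):
            j = Q(i, a)
            if j != 0:
                M[i - 1][j - 1] += Fraction(1, 2 ** m)
    return M

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def norm(M):
    """Maximum absolute row sum, as in Definition 2."""
    return max(sum(abs(x) for x in row) for row in M)

def hashed_square(Q, h):
    """Q_h(i, a): run Q on the two-symbol string a, h(a)."""
    return lambda i, a: Q(Q(i, a), h(a))

def pseudorandom_error(Q, h, d, m):
    """|| M(Q_h) - M(Q)^2 ||; h is eps-pseudorandom iff this is at most eps."""
    M = transition_matrix(Q, d, m)
    M2 = mat_mul(M, M)
    Mh = transition_matrix(hashed_square(Q, h), d, m)
    return norm([[Mh[i][j] - M2[i][j] for j in range(d)] for i in range(d)])

def toy_automaton(i, a):
    """Hypothetical (2,2)-automaton over symbols 0..3; state 0 is dead."""
    if i == 1:
        return 1 if a % 2 == 0 else 2
    if i == 2:
        return 2 if a < 2 else 1
    return 0

def shift_hash(a):
    """Illustrative map on 2-bit symbols."""
    return (a + 1) % 4
```

For this pair the error is $1/2$, so `shift_hash` would be a poor choice here; the point of Lemma 1 is that most members of a universal family do much better once $m$ is large enough.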

A crucial question is how to find the $\epsilon$-pseudorandom hash functions $h$ we
need. Nisan argues that $\epsilon$-pseudorandom hash functions for a
$(d,m)$-automaton abound in a universal family of hash functions $\mathcal{H}_m$
from $\{0,1\}^m$ to $\{0,1\}^m$ with $m = \Omega(\log d + \log 1/\epsilon)$. We will
fix a universal family such that we can uniformly sample from $\mathcal{H}_m$ in
$O(m)$ space, as well as evaluate any $h \in \mathcal{H}_m$ on a given input. An
example of such a family is the set of all linear functions over a finite field
with $2^m$ elements.

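Lemma 1. Let $Q$ be a $(d,m)$-automaton, $\mathcal{H}_m$ a universal family of hash
functions, and $\epsilon > 0$. Then

$$\Pr_{h \in \mathcal{H}_m}[h \text{ is not } \epsilon\text{-pseudorandom for } Q] \le \frac{d^7}{\epsilon^2 2^m}.$$

Moreover, given $Q$ and $\epsilon$, an exhaustive search for a hash function
$h \in \mathcal{H}_m$ that is $\epsilon$-pseudorandom for $Q$ can be performed in space
$O(m + \log d)$.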

Using Lemma 1, we can approximately square a $(d,m)$-matrix $M$ as follows:
Construct the canonical $(d,m)$-automaton $Q = \mathcal{Q}(M)$, find a hash function
$h \in \mathcal{H}_m$ that is $\epsilon$-pseudorandom for $Q$, construct $Q_h$, and
transform $Q_h$ into its transition matrix $\mathcal{M}(Q_h)$. The result is a
$(d,m)$-matrix that is no more than $\epsilon$ away from $M^2$ in norm. When we
apply this procedure $r$ times successively to approximate $M^{2^r}$, we can, of
course, eliminate the intermediate automaton-to-matrix and matrix-to-automaton
transformations. We introduce the following simplified notation for the
intermediate automata.

Definition 6. Let $Q$ be a $(d,m)$-automaton and $h_1, h_2, \ldots, h_k$ be hash
functions over $\{0,1\}^m$. Then we define $Q_{h_1,h_2,\ldots,h_k}$, also denoted by
$\mathcal{N}_{h_1,h_2,\ldots,h_k}(Q)$, as

$$Q_{h_1,h_2,\ldots,h_k} = (\cdots((Q_{h_1})_{h_2})\cdots)_{h_k}.$$

Another important question is how the errors accumulate. The accuracy of 
the final result is governed by Lemma 2, which uses the following definition [5]. 

Definition 7. Let $Q$ be a $(d,m)$-automaton and $h_1, h_2, \ldots, h_r$ be hash
functions over $\{0,1\}^m$. We say that the sequence $h_1, h_2, \ldots, h_r$ is
$\epsilon$-pseudorandom for $Q$ if
$\|\mathcal{M}(Q_{h_1,h_2,\ldots,h_r}) - (\mathcal{M}(Q))^{2^r}\| \le \epsilon$.
Lemma 2. Let $Q$ be a $(d,m)$-automaton, and $\epsilon > 0$. Let
$h_1, h_2, \ldots, h_r$ be hash functions over $\{0,1\}^m$ such that, for each
$1 \le i \le r$, $h_i$ is $\epsilon$-pseudorandom for $Q_{h_1,h_2,\ldots,h_{i-1}}$.
Then the sequence $h_1, h_2, \ldots, h_r$ is $\epsilon$-pseudorandom for $Q$.

It follows from Lemmas 1 and 2 that a choice of $m = O(\log d)$ is sufficient to
obtain a good enough approximation to $M^{2^r}$ for our purposes.

A straightforward recursive implementation of Nisan's algorithm uses roughly
the same amount of resources as the trivial recursive squaring approach, namely
$O(\log^2 d)$ space and $d^{O(\log d)}$ time. However, the crucial property in
Proposition 1 below allowed Nisan to do better. The following definition plays
a central role.



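Definition 8. Let $Q$ be a $(d,m)$-automaton, and $h_1, h_2, \ldots, h_r$ be hash
functions over $\{0,1\}^m$. Recursively define $G_{h_1,h_2,\ldots,h_r}$ as a mapping
from $\{0,1\}^m$ to $(\{0,1\}^m)^{2^r}$ by $G(\alpha) = \alpha$, and

$$G_{h_1,h_2,\ldots,h_i}(\alpha) = G_{h_1,h_2,\ldots,h_{i-1}}(\alpha) \circ G_{h_1,h_2,\ldots,h_{i-1}}(h_i(\alpha)).$$

Note that, given hash functions $h_1, h_2, \ldots, h_r \in \mathcal{H}_m$, a bit index,
and $\alpha \in \{0,1\}^m$, any bit of $G_{h_1,h_2,\ldots,h_r}(\alpha)$ can be computed
in $O(m + \log r)$ space.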

Proposition 1. Let $Q$ be a $(d,m)$-automaton and $h_1, h_2, \ldots, h_r$ be hash
functions over $\{0,1\}^m$. Then for any $i \in \{0,1,\ldots,d\}$ and
$\alpha \in \{0,1\}^m$,

$$Q_{h_1,h_2,\ldots,h_r}(i,\alpha) = \hat{Q}(i, G_{h_1,h_2,\ldots,h_r}(\alpha)).$$
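
Proposition 1 can be checked mechanically on a toy instance. The sketch assumes the recursion $G_{h_1,\ldots,h_i}(\alpha) = G_{h_1,\ldots,h_{i-1}}(\alpha) \circ G_{h_1,\ldots,h_{i-1}}(h_i(\alpha))$, which is what unfolding Definitions 5 and 6 gives; symbols are integers in $[0, 2^m)$ and all names are illustrative.

```python
def run(Q, state, symbols):
    """Extended map: apply Q symbol by symbol to a string of symbols."""
    for a in symbols:
        state = Q(state, a)
    return state

def generator(hashes, a):
    """G_{h_1..h_r}(a) as a list of 2^r symbols."""
    if not hashes:
        return [a]
    *rest, h = hashes
    return generator(rest, a) + generator(rest, h(a))

def hashed_automaton(Q, hashes):
    """Q_{h_1..h_r}, built by applying the squaring step once per hash."""
    for h in hashes:
        # bind the current automaton and hash so each level keeps its own pair
        Q = (lambda P, g: lambda i, a: P(P(i, a), g(a)))(Q, h)
    return Q

def toy_automaton(i, a):
    """Hypothetical (2,2)-automaton over symbols 0..3; state 0 is dead."""
    if i == 1:
        return 1 if a % 2 == 0 else 2
    if i == 2:
        return 2 if a < 2 else 1
    return 0
```

The left-hand side needs a recursion of depth $r$, while the right-hand side only needs the current state and a way to produce the next symbol of $G(\alpha)$, which is the source of the space saving discussed next.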

Although Proposition 1 looks innocuous, it has far reaching consequences. First,
it shows that we obtain our approximations to the probabilities defined by (1)
for $k = 2^r$ by subsampling, i.e., by using a pseudorandom generator. The
function mapping $(h_1, h_2, \ldots, h_r, \alpha)$ to
$G_{h_1,h_2,\ldots,h_r}(\alpha)$ defines a pseudorandom generator stretching
$O(rm)$ random bits to $2^r m$ pseudorandom bits that "fool" the automaton $Q$.
We will not explicitly use this fact but it was Nisan's original
motivation [5].

Second, and more important for us, Proposition 1 provides a shortcut for
storing the intermediate results in the recursive computation of
$Q_{h_1,h_2,\ldots,h_r}$. Given $h_1, h_2, \ldots, h_r \in \mathcal{H}_m$,
$i \in \{0,1,\ldots,d\}$, and $\alpha \in \{0,1\}^m$, we can now compute
$Q_{h_1,h_2,\ldots,h_r}(i,\alpha)$ in space $O(m + r + \log d)$ instead of
$O(r(m + \log d))$. This allows us to deterministically find pseudorandom hash
functions $h_i$ for every level of recursion as well as compute
$\mathcal{M}(Q_{h_1,h_2,\ldots,h_r})$ from $h_1, h_2, \ldots, h_r$ using only
$O(m + r + \log d)$ work space, and therefore in time $d^{O(1)}$. The only
additional space needed is $O(rm)$ bits to store the hash functions. This is how
Nisan established his result that $BPL \subseteq DTISP[n^{O(1)}, O(\log^2 n)]$ [6].

3.2 Saks-Zhou Rounding 

The only reason why Nisan's algorithm needs more than logarithmic space is to
store the $r = O(\log d)$ hash functions of $O(m) = O(\log d)$ bits each. A natural
approach to save space, then, is to use the same hash functions at multiple
levels of recursion.

Given any fixed collection of $(d,m)$-automata $Q_1, Q_2, \ldots, Q_r$, we can
efficiently find a hash function $h$ that is pseudorandom for all of them. This
follows from Lemma 1.

Corollary 1. For any $(d,m)$-automata $Q_1, Q_2, \ldots, Q_r$,

$$\Pr_{h \in \mathcal{H}_m}[h \text{ is not } \epsilon\text{-pseudorandom for at least one of } Q_1, Q_2, \ldots, Q_r] \le \frac{r d^7}{\epsilon^2 2^m}.$$

Moreover, given $Q_1, Q_2, \ldots, Q_r$ and $\epsilon$, an exhaustive search for a
hash function $h \in \mathcal{H}_m$ that is $\epsilon$-pseudorandom for all of
$Q_1, Q_2, \ldots, Q_r$ can be performed in space $O(m + \log d + \log r)$.

However, the sequence $Q_1, Q_2, \ldots, Q_r$ in Nisan's algorithm is not fixed but
depends on the choice of the hash functions: $Q_{i+1} = (Q_i)_{h_i}$. If Lemma 1
guaranteed the existence of a fixed automaton $Q^*$ such that $Q_h = Q^*$ for most
choices of $h$, then we could apply Corollary 1 to the sequence
$Q_1, Q_2, \ldots, Q_r$ with $Q_1 = Q$ and $Q_{i+1} = Q_i^*$, and safely use the
same hash function $h$ at every level of recursion. Unfortunately, Lemma 1 only
tells us that $Q_h$ is "close" to some fixed $Q^*$ for most $h$ and not that it
coincides with $Q^*$ for most $h$. Similarly, Corollary 1 does not guarantee the
existence of a hash function $h$ that is pseudorandom for all of
$Q, Q_h, (Q_h)_h$, etc.

Saks and Zhou [9] devised a randomized rounding scheme to make the automata at
every level of recursion independent of the hash functions at the previous
levels (for most choices of these hash functions), namely by randomly
perturbing and then truncating the entries of the transition matrices. We will
review the details below and see that the rounding requires $O(\log d)$ bits to
be stored for a $d \times d$ matrix.

We point out one unfortunate side-effect of the rounding scheme, which
wasn't critical for Saks and Zhou but will be for us: the breakdown of the
analogue of Proposition 1. We can no longer circumvent the recursion because we
need to act on the matrices, and therefore have to transform between automata
and matrices at every level of recursion. As a consequence, the processing space
increases from $O(m + r + \log d)$ to $O(r(m + \log d))$ again.

The storage space consists of two components: $O(m)$ bits for the hash function
$h$, and $O(r \log d)$ bits for all the perturbations. In order to balance the
two components of the space complexity, Saks and Zhou considered a hybrid
algorithm that uses $r_1$ hash functions $h = (h_1, h_2, \ldots, h_{r_1})$ cyclically $r_2$
times, where $r_1 r_2 = r$, and applies the randomized rounding between cycles.
The processing space becomes $O(r_2(m + \log d))$ and the storage space
$O(r_1 m + r_2 \log d)$. By setting $r_1 = r_2 = \sqrt{r}$, Saks and Zhou obtained their
result that $\mathrm{BPL} \subseteq \mathrm{DTISP}[n^{O(\log^{0.5} n)}, O(\log^{1.5} n)]$ [9].
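
The balancing step can be illustrated numerically. The following is a minimal sketch, not part of the Saks-Zhou algorithm itself: it drops all hidden constants, takes the two bounds $O(r_2(m + \log d))$ and $O(r_1 m + r_2 \log d)$ literally, and picks the split that minimizes the larger of the two. The function names `hybrid_space` and `best_split` are hypothetical.

```python
def hybrid_space(r, m, log_d, r1):
    """Return (processing, storage) for the split r = r1 * r2, constants dropped."""
    if r % r1 != 0:
        raise ValueError("r1 must divide r")
    r2 = r // r1
    processing = r2 * (m + log_d)    # stands in for O(r2 (m + log d))
    storage = r1 * m + r2 * log_d    # stands in for O(r1 m + r2 log d)
    return processing, storage


def best_split(r, m, log_d):
    """Among the divisors r1 of r, pick the one minimizing max(processing, storage)."""
    divisors = [r1 for r1 in range(1, r + 1) if r % r1 == 0]
    return min(divisors, key=lambda r1: max(hybrid_space(r, m, log_d, r1)))
```

When $m$ and $\log d$ are of the same order, the minimum in this toy model is attained at $r_1 = r_2 = \sqrt{r}$, in line with the choice made by Saks and Zhou.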

We now discuss the Saks-Zhou rounding scheme in more detail for the hybrid 
setting, and adapt it to our needs. 

Definition 9. Let $t, D$ be positive integers, and $\delta \in \{0,1\}^D$. We define the
rounding operator $\mathcal{R}_{\delta,t}$ as the function mapping $z \in [0,1]$ to
$\lfloor \max(z - \bar{\delta} 2^{-t}, 0) \rfloor_t$, where $\bar{\delta}$ denotes the number with binary
representation $0.\delta$, and $\lfloor z \rfloor_t = \lfloor 2^t z \rfloor 2^{-t}$.
The operator extends to real-valued matrices with each entry in $[0,1]$ by entry-wise
application.

The effect of $\mathcal{R}_{\delta,t}$ is to perturb the binary expansion of every number after the
$t$-th bit by $\delta$ (obtaining a smaller number but never going negative), and then
truncate it after the $t$-th bit. Note that if $M$ is a $(d,m)$-matrix and $t < m$, then
$\mathcal{R}_{\delta,t}(M)$ is also a $(d,m)$-matrix.
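
As an illustration, the rounding operator of Definition 9 can be sketched with exact rational arithmetic. This is only a sketch under our own conventions: $\delta$ is passed as a string of bits, matrices are lists of rows, and the helper names are hypothetical.

```python
from fractions import Fraction
from math import floor


def delta_value(delta_bits):
    """Interpret a bit string such as '101' as the binary fraction 0.101."""
    return Fraction(int(delta_bits, 2), 2 ** len(delta_bits))


def truncate(z, t):
    """Keep the first t bits after the binary point: floor(2^t z) / 2^t."""
    return Fraction(floor(z * 2 ** t), 2 ** t)


def round_entry(z, delta_bits, t):
    """Shift z down by delta * 2^-t (never below 0), then truncate to t bits."""
    shifted = max(z - delta_value(delta_bits) * Fraction(1, 2 ** t), Fraction(0))
    return truncate(shifted, t)


def round_matrix(M, delta_bits, t):
    """Entry-wise application to a matrix given as a list of rows."""
    return [[round_entry(z, delta_bits, t) for z in row] for row in M]
```

For example, with $t = 2$ and $\delta = 10$ the perturbation is $0.10_2 \cdot 2^{-2} = 1/8$, so the entry $5/8$ is first lowered to $1/2$ and then truncated to $1/2$.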

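Definition 10. Let $M$ be a $d \times d$-matrix over $[0,1]$, and $t, D, r_1, k$ be positive
integers. Let $\delta = (\delta_1, \delta_2, \ldots, \delta_k) \in (\{0,1\}^D)^k$. We define the rounded sequence
$R(M, \delta)$ with parameters $t$, $r_1$, and $k$ as the following sequence of matrices:

$N_0, \hat{N}_1, N_1, \hat{N}_2, N_2, \ldots, \hat{N}_k, N_k$

where $N_0 = M$ and, for $1 \le i \le k$,

$\hat{N}_i = N_{i-1}^{2^{r_1}}, \quad N_i = \mathcal{R}_{\delta_i,t}(\hat{N}_i).$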

Note that if $M$ is a $(d,m)$-matrix and $t < m$, then every matrix $N_i$ is a
$(d,m)$-matrix. Their canonical automata $Q(N_i)$ are the “fixed” automata to which
Corollary 1 will be applied.
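
The rounded sequence can be computed directly, if inefficiently, by alternating repeated squaring with rounding. The sketch below assumes exact rational entries and is meant only to make Definition 10 concrete; the function names are hypothetical, and nothing here reflects the space-efficient computation that the paper is after.

```python
from fractions import Fraction
from math import floor


def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def square_repeatedly(M, r1):
    """Return M^(2^r1) by squaring r1 times."""
    for _ in range(r1):
        M = mat_mul(M, M)
    return M


def round_matrix(M, delta_bits, t):
    """Entry-wise rounding: subtract 0.delta * 2^-t, clamp at 0, keep t bits."""
    shift = Fraction(int(delta_bits, 2), 2 ** (len(delta_bits) + t))
    return [[Fraction(floor(max(z - shift, 0) * 2 ** t), 2 ** t) for z in row]
            for row in M]


def rounded_sequence(M, deltas, t, r1):
    """Return [N_0, N_1, ..., N_k]; each step raises to the power 2^r1, then rounds."""
    seq = [M]
    for delta_bits in deltas:
        unrounded = square_repeatedly(seq[-1], r1)
        seq.append(round_matrix(unrounded, delta_bits, t))
    return seq
```

Only the rounded matrices $N_i$ are returned; the intermediate unrounded powers are discarded after each step.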

Rounding does not affect the final result by much.

Lemma 3. Let $M$ be a $d \times d$-matrix over $[0,1]$, and $t, D, r_1, k$ be positive
integers. Let $\delta = (\delta_1, \delta_2, \ldots, \delta_k) \in (\{0,1\}^D)^k$. Then, the matrix $N_k$ defined in the
rounded sequence $R(M, \delta)$ with parameters $t$, $r_1$, and $k$ satisfies
$\|N_k - M^{2^{r_1 k}}\| < 2^{-t + r_1 k + \log d + 1}$.



The crucial property of the rounding operator is that for any $z \in [0,1]$, unless
$\delta$ is exceptional, $\mathcal{R}_{\delta,t}$ maps any number that is within about $2^{-(t+D)}$ of $z$ to the
same value as $z$. Saks and Zhou formalized this property using the notion of
safety. We will distinguish between two levels of safety. We define them right 
away for matrices. 

Definition 11. Let $M$ be a $d \times d$-matrix over $[0,1]$, $t, D$ positive integers, and
$\delta \in \{0,1\}^D$. We call $\delta$ $t$-safe for $M$ if for any $d \times d$-matrix $Z$ over $[0,1]$
satisfying $\|Z - M\| < 2^{-(t+D)}$, $\mathcal{R}_{\delta,t}(Z) = \mathcal{R}_{\delta,t}(M)$. If the condition holds whenever
$Z$ satisfies $\|Z - M\| < 2^{-(t+D+1)}$, we call $\delta$ $t$-weakly-safe for $M$.

By definition, t-safeness implies t-weakly-safeness. We will use the following 
stronger relationship between the two levels of safety. 

Lemma 4. Let $M$ and $N$ be $d \times d$-matrices over $[0,1]$, $t, D$ positive integers,
and $\delta \in \{0,1\}^D$. If $\delta$ is $t$-safe for $M$ and $\|M - N\| < 2^{-(t+D+1)}$, then $\delta$ is
$t$-weakly-safe for $N$.

For any $z \in [0,1]$, there are at most two values of $\delta \in \{0,1\}^D$ that are not $t$-safe
for $z$ (viewed as a $1 \times 1$ matrix), and these values are easy to find. We have the
following generalization for $d \times d$ matrices.

Lemma 5. Let $M$ be a $d \times d$-matrix over $[0,1]$, and $t, D$ positive integers. For
all but at most $2d^2$ values of $\delta \in \{0,1\}^D$, the operator $\mathcal{R}_{\delta,t}$ is $t$-safe for $M$.
Moreover, given $M$, $t$, and $D$, an exhaustive search for a $\delta \in \{0,1\}^D$ that is
$t$-safe for $M$ can be performed in space $O(D + \log d + \log t)$.
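
The exhaustive search can be sketched as follows. This is a simplified illustration, not the space-bounded procedure of the lemma. It assumes that every admissible $Z$ differs from $M$ by at most $2^{-(t+D)}$ in each entry, and it uses a sufficient (slightly conservative) test: since rounding is monotone in $z$, it is enough that the two endpoints of the closed interval around each entry round to the same value. The names `is_safe` and `find_safe_delta` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import floor


def round_entry(z, delta_bits, t):
    """Subtract 0.delta * 2^-t, clamp at 0, and keep t bits after the point."""
    shift = Fraction(int(delta_bits, 2), 2 ** (len(delta_bits) + t))
    return Fraction(floor(max(z - shift, Fraction(0)) * 2 ** t), 2 ** t)


def is_safe(M, delta_bits, t):
    """Sufficient check: both ends of each entry's eps-interval round alike."""
    eps = Fraction(1, 2 ** (t + len(delta_bits)))
    for row in M:
        for z in row:
            lo = max(z - eps, Fraction(0))
            hi = min(z + eps, Fraction(1))
            if round_entry(lo, delta_bits, t) != round_entry(hi, delta_bits, t):
                return False
    return True


def find_safe_delta(M, t, D):
    """Try all 2^D bit strings in lexicographic order; return the first that passes."""
    for bits in product("01", repeat=D):
        delta_bits = "".join(bits)
        if is_safe(M, delta_bits, t):
            return delta_bits
    return None
```

Because the test uses closed intervals, it may reject a $\delta$ that is safe under the strict inequality of Definition 11; it never accepts an unsafe one under the stated entry-wise assumption.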

Definition 12. Let $M$ be a $(d,m)$-matrix, and $t, D, r_1, k$ positive integers with
$t < m$. Let $\delta = (\delta_1, \delta_2, \ldots, \delta_k) \in (\{0,1\}^D)^k$. Let $h = (h_1, h_2, \ldots, h_{r_1})$ be
a sequence of hash functions from $\mathcal{H}_m$. We define the Saks-Zhou sequence
$SZ(M, h, \delta)$ with parameters $t$, $r_1$, and $k$ as the following sequence of matrices
and automata:

$M_0, Q_0, \hat{Q}_1, \hat{M}_1, M_1, Q_1, \hat{Q}_2, \hat{M}_2, M_2, \ldots, Q_{k-1}, \hat{Q}_k, \hat{M}_k, M_k$

where $M_0 = M$ and, for $1 \le i \le k$,

$Q_{i-1} = Q(M_{i-1}), \quad \hat{Q}_i = (Q_{i-1})_h, \quad \hat{M}_i = M(\hat{Q}_i), \quad M_i = \mathcal{R}_{\delta_i,t}(\hat{M}_i).$

See Figure 1 for a schema of the construction of the Saks-Zhou sequence
$SZ(M, h, \delta)$, the rounded sequence $R(M, \delta)$, and their intended relationships (for
$k = r_2$). In the figure, the operator $\mathcal{P}$ denotes raising a matrix to the power $2^{r_1}$.

Saks and Zhou defined $\delta$ to be $t$-safe for the sequence $R(M, \delta)$ if $\delta_i$ is $t$-safe
for $\hat{N}_i$, for each $i$. As can be seen from Lemma 5, for a large enough choice of
$D = O(\log d)$, a random $\delta$ will be safe for $R(M, \delta)$. When $\delta$ is safe for $R(M, \delta)$,
one can argue that for most $h$, $M_i = N_i$ for $0 \le i \le r_2$, where $M_i$ and $N_i$
are matrices defined in $SZ(M, h, \delta)$ and $R(M, \delta)$, respectively. In particular,
$M_{r_2} = N_{r_2}$ is a good approximation to $M^{2^r}$.

The fact that $M_i$ and $N_i$ coincide for every $1 \le i \le r_2$ can be seen as follows.
Let $P_0, P_1, \ldots, P_{r_2}$ be the canonical automata corresponding to the matrices
$N_0, N_1, \ldots, N_{r_2}$, respectively. If we choose $h$ at random from
$\mathcal{H}_m^{r_1} = \mathcal{H}_m \times \mathcal{H}_m \times \cdots \times \mathcal{H}_m$, then, with high probability, $h$ is
$\epsilon$-pseudorandom for all the automata $P_0, P_1, \ldots, P_{r_2}$, for suitably small $\epsilon$.
This is guaranteed by Corollary 1 because
these automata are defined using only $M$ and $\delta$, and in particular they do not
depend on $h$. By definition, $M_0 = N_0 = M$ and thus $Q_0 = P_0$. Therefore $h$ is
$\epsilon$-pseudorandom for $Q_0$. It follows that $\hat{M}_1$ is close to $\hat{N}_1$. Then $M_1 = N_1$ by $\delta_1$
being safe for $\hat{N}_1$. Continuing this way for $r_2$ steps, we get $M_{r_2} = N_{r_2}$.

Using these properties, the Saks-Zhou [9] algorithm approximates $M^{2^r}$ as
the average of the matrices $M_{r_2}$ over all $\delta$ and $h$.

4 The Tradeoff 

In this section, we sketch the proof of Theorem 2. We refer the reader to a longer
version of the paper [3] for more details.

Given a $(d,m)$-matrix $M$ and integers $r_1, r_2$ with $r = r_1 r_2$, the Saks-Zhou
algorithm described in Section 3.2 approximates $M^{2^r}$ in space $O(\max\{r_1, r_2\} \log d)$
but uses time $d^{O(r_1 + r_2)}$. The $d^{O(r_1)}$ factor comes from trying all possible
sequences of $r_1$ hash functions. Our goal is to avoid this exhaustive search and
approximate $M^{2^r}$ using space $O(\max\{r_1, r_2\} \log d)$ and time $d^{O(\min\{r_1, r_2\})}$. We
will choose $r_1 \ge r_2$, and thus, we want to approximate $M^{2^r}$ in space $O(r_1 \log d)$
and time $d^{O(r_2)}$.

For any sequence $\delta = (\delta_1, \delta_2, \ldots, \delta_{r_2}) \in (\{0,1\}^D)^{r_2}$, the matrix $N_{r_2}$ defined
in the rounded sequence $R(M, \delta)$ is a close approximation to $M^{2^r}$. We want
to find a $\delta$ and a sequence of hash functions $h = (h_1, h_2, \ldots, h_{r_1})$ such that
we can efficiently compute the matrices $N_0, N_1, \ldots, N_{r_2}$ defined in the sequence
$R(M, \delta)$ by computing the corresponding matrices $M_0, M_1, \ldots, M_{r_2}$ in the Saks-Zhou
sequence $SZ(M, h, \delta)$. See Figure 1 for the correspondences.



Fig. 1. Rounded and Saks-Zhou sequences



A natural idea is to find the hash functions $h_i$, one by one, based on Lemma 1.
Thus, for the automaton $Q_0 = Q(M)$, we can find a sequence of hash functions,
$h = (h_1, h_2, \ldots, h_{r_1})$, one by one, such that $\hat{M}_1 = M(\hat{Q}_1) = M(\mathcal{N}_h(Q_0))$ is very
close to $\hat{N}_1 = M^{2^{r_1}}$. This step is essentially Nisan’s algorithm for raising $M$
to the power $2^{r_1}$. Each step to find the next $h_i$ is carried out by an exhaustive
search over $\mathcal{H}_m$, and thus can be done in polynomial time and space $O(r_1 m)$.

Then we can deterministically find a $\delta_1 \in \{0,1\}^D$ which is safe for $\hat{M}_1$. By
Lemma 5, we can guarantee the existence of $\delta_1$ by setting $D = O(\log d)$. Since
$\hat{M}_1$ is very close to $\hat{N}_1$, $\delta_1$ will in fact be weakly-safe for $\hat{N}_1$, even though we
never really compute $\hat{N}_1$ exactly. With $h$ already written, the task of finding $\delta_1$
is in logspace (and thus in polynomial time). Since $\hat{M}_1$ and $\hat{N}_1$ are close, the
rounding operator $\mathcal{R}_{\delta_1,t}$ on $\hat{M}_1$ and $\hat{N}_1$ yields identical matrices $M_1 = N_1$. At
this point, we have computed $h = (h_1, h_2, \ldots, h_{r_1})$ and $\delta_1$, which can be viewed as a succinct
representation of $N_1$, which is an approximation to $M^{2^{r_1}}$.

Now we would like to extend this to $N_2$. Let $Q_1 = Q(M_1) = Q(N_1)$. By
Corollary 1, it is true that most sequences of $r_1$ hash functions $h'$ will be
simultaneously pseudorandom for $Q_0$ and $Q_1$. If we had chosen our $h$ randomly, then
with high probability it would be pseudorandom for both $Q_0$ and $Q_1$. But we
found our $h$ deterministically; there is no guarantee that $h$ will be pseudorandom
for $Q_1$.

Our idea is to find a new sequence of hash functions $h'$. With the succinct
representation $h$ and $\delta_1$ in hand, we can compute $Q_1$ and find a new sequence
of $r_1$ hash functions $h'$ which is pseudorandom for both $Q_0$ and $Q_1$. Of course,
due to the space bound we cannot keep the old $h$ around. So, at this point we
want to discard $h$ and replace it by $h'$.

However, after discarding the old $h$, we no longer have access to $\hat{M}_1 = M(\mathcal{N}_h(Q_0))$,
and it is from $\hat{M}_1$ and $\delta_1$ that we defined $Q_1$, for which the new $h'$
was found to be pseudorandom. If at this point the new $\hat{M}_1' = M(\mathcal{N}_{h'}(Q_0))$ were
to lead to an automaton different from $Q_1$, we would have made no progress.

However, note that $\delta_1$ is weakly-safe for $\hat{N}_1$. While $\hat{M}_1' = M(\mathcal{N}_{h'}(Q_0))$ may
be different from $\hat{M}_1 = M(\mathcal{N}_h(Q_0))$, both matrices are very close. In fact, the
new $\hat{M}_1' = M(\mathcal{N}_{h'}(Q_0))$ will be close to $\hat{N}_1$. It follows that when we apply the
rounding operator $\mathcal{R}_{\delta_1,t}$, the matrices are the same again, namely $M_1 = N_1$. See
the left-hand side of Figure 2. Therefore, we do get back to the same automaton
$Q_1 = Q(M_1) = Q(N_1)$.



Fig. 2. Finding new sequence of hash functions



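We then proceed to find a $\delta_2$ which is weakly-safe for $\hat{N}_2 = N_1^{2^{r_1}}$. Note that
$\hat{N}_2$ is never computed exactly but we choose $\delta_2$ as being safe for the matrix
$\hat{M}_2' = M(\mathcal{N}_{h'}(Q_1))$, which is very close to the matrix $\hat{N}_2$. Then applying $\mathcal{R}_{\delta_2,t}$
to either matrix yields $N_2$. We have arrived at a succinct representation for $N_1$
and $N_2$, namely $h' = (h_1', h_2', \ldots, h_{r_1}')$ and $(\delta_1, \delta_2)$.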

Generalizing, we accomplish the computation of our final approximation
$N_{r_2}$ to $M^{2^r}$ in $r_2$ stages as follows (see Figure 2). At the end of stage $k$,
we will have computed a sequence of $r_1$ hash functions $h$ and a sequence
$\delta = (\delta_1, \delta_2, \ldots, \delta_k) \in (\{0,1\}^D)^k$. They constitute a succinct representation of the
matrices $N_0, N_1, \ldots, N_k$ of the rounded sequence; we can compute these matrices
$N_i$ as the corresponding matrices $M_i$ in the Saks-Zhou sequence $SZ(M, h, \delta)$.
Moreover, each $\delta_i$ in $\delta$ is weakly-safe for $\hat{N}_i$ in $R(M, \delta)$, $1 \le i \le k$.

In the $(k+1)$st stage, we use $h$ and $\delta$ to reconstruct the matrices $N_0 = M_0$,
$N_1 = M_1$, ..., $N_k = M_k$. We then compute a new sequence of $r_1$ hash
functions $h'$, one by one, which is simultaneously pseudorandom for $Q_0 = Q(N_0)$,
$Q_1 = Q(N_1), \ldots, Q_k = Q(N_k)$. Thus, while both $h$ and $h'$ are pseudorandom
for $Q_0, Q_1, \ldots, Q_{k-1}$, $h'$ is also pseudorandom for $Q_k$. Now the matrices
$\hat{M}_1' = M(\mathcal{N}_{h'}(Q_0)), \ldots, \hat{M}_k' = M(\mathcal{N}_{h'}(Q_{k-1}))$ are sufficiently
close to $\hat{N}_1, \hat{N}_2, \ldots, \hat{N}_k$, respectively, that when we apply the rounding
operator $\mathcal{R}$ based on $\delta$ and $t$, we get the matrices $N_1, \ldots, N_k$ of the rounded
sequence $R(M, \delta)$ back. This is because $\delta_i$ is weakly-safe for $\hat{N}_i$, $1 \le i \le k$.
In addition, we have obtained $\hat{M}_{k+1}' = M(\mathcal{N}_{h'}(Q_k))$, which is close to $\hat{N}_{k+1}$.

We then find a $\delta_{k+1} \in \{0,1\}^D$ which is guaranteed to be weakly-safe for the
matrix $\hat{N}_{k+1}$. Again, even though we do not really compute this $\hat{N}_{k+1}$ exactly,
we find $\delta_{k+1}$ as being safe for its close approximation $\hat{M}_{k+1}'$. At this point, $h'$
and $\delta' = (\delta_1, \delta_2, \ldots, \delta_k, \delta_{k+1})$ constitute a succinct representation of the
matrices $N_0, N_1, \ldots, N_k, N_{k+1}$ in $R(M, \delta')$, as shown in Figure 2. Now, we discard the
old $h$ and replace it with the newly found $h'$, and we append $\delta_{k+1}$ to $\delta$.

At the end of $r_2$ stages, we have found $h$ and $\delta = (\delta_1, \delta_2, \ldots, \delta_{r_2})$ such that
the rounded sequence $R(M, \delta)$ and the Saks-Zhou sequence $SZ(M, h, \delta)$ agree as
in Figure 1.

In terms of time and space complexity, the most time consuming task is to
recursively reconstruct the automata and matrices in the Saks-Zhou sequences
with a maximum of $r_2$ levels of recursion, each level involving a polynomial time
computation. The most space consuming task is to store the $r_1$ hash functions $h$.
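
The control flow of the $r_2$ stages can be summarized in a short skeleton. This is only a schematic sketch: the four callables are hypothetical placeholders for the steps described above (reconstruction of $Q_0, \ldots, Q_k$ from the succinct representation, the search for a new hash sequence via Lemma 1, the computation of the next approximate matrix, and the search for a safe perturbation via Lemma 5), and none of the space-saving recursion is modelled.

```python
def derandomize_in_stages(M, r2, reconstruct, find_hashes, next_matrix, find_safe_delta):
    """Run r2 stages, keeping only the current hash sequence and the perturbations."""
    h, deltas = None, []
    for _ in range(r2):
        # Rebuild Q_0, ..., Q_k from the succinct representation (h, deltas).
        automata = reconstruct(M, h, deltas)
        # Find a new hash sequence pseudorandom for all of them; the old h is dropped.
        h = find_hashes(automata)
        # Approximate the next unrounded matrix from Q_k and the new hash sequence.
        approx = next_matrix(automata[-1], h)
        # Choose a perturbation that is safe for this approximation and record it.
        deltas.append(find_safe_delta(approx))
    return h, deltas
```

In the first stage `reconstruct` is called with `h = None` and no perturbations, and is expected to return just the automaton $Q_0 = Q(M)$.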

5 Further Research 

In this paper, we established a tradeoff between Nisan’s result that
$\mathrm{BPL} \subseteq \mathrm{DTISP}[n^{O(1)}, O(\log^2 n)]$, and the Saks-Zhou result that
$\mathrm{BPL} \subseteq \mathrm{DTISP}[n^{O(\log^{0.5} n)}, O(\log^{1.5} n)]$. It would be nice to get the best of
both worlds at the same time,
i.e., a deterministic simulation of BPL that simultaneously runs in polynomial
time and space $O(\log^{1.5} n)$. The breakdown of Proposition 1 seems to be the
bottleneck.

Regarding undirected st-connectivity, Armoni et al. [2] managed to reduce
the space bound to $O(\log^{4/3} n)$. At a high level, they apply the Saks-Zhou rounding
technique to the shrinking strategy of Nisan, Szemerédi, and Wigderson [7].
Does the $O(\log^{4/3} n)$ space algorithm lead to a better time-space tradeoff for
undirected st-connectivity than our general result for BPL?

The big open question, of course, is whether BPL can be simulated
deterministically in logarithmic space. We know that this is the case if there exists a
language in deterministic linear space that requires branching programs of size
$2^{\epsilon n}$ for some positive constant $\epsilon$ [4].


Abstract. We investigate the question of whether one can characterize
complexity classes (such as PSPACE or NEXP) in terms of efficient
reducibility to the set of Kolmogorov-random strings $R_K$. We show
that this question cannot be posed without explicitly dealing with issues
raised by the choice of universal machine in the definition of Kolmogorov
complexity. Among other results, we show that although for every universal
machine $U$, there are very complex sets that are $\le^p_{dtt}$-reducible to
$R_{K_U}$, it is nonetheless true that $\mathrm{P} = \mathrm{REC} \cap \bigcap_U \{A : A \le^p_{dtt} R_{K_U}\}$. We
also show for a broad class of reductions that the sets reducible to $R_K$
have small circuit complexity.



1 Introduction 

The set of random strings is one of the most important notions in Kolmogorov
complexity theory. A string is random if $K(x) \ge |x|$. (Given a Turing machine
$U$, $K_U(x)$ is defined to be the minimum length of a “description” $d$ such that
$U(d) = x$. As usual, we fix one such “universal” machine $U$ and define $K(x)$ to
be equal to $K_U(x)$. In most applications, it does not make much difference which
“universal” machine $U$ is picked; it suffices that $U$ satisfies the property that for
all $U'$ there exists a constant $c$ such that $K_U(x) \le K_{U'}(x) + c$.) Let $R_K$ denote
the set of random strings, and let $R_{K_U}$ denote the corresponding set when we
need to be specific about the particular choice of machine $U$.

It has been known since [4] that $R_K$ is co-r.e. and is complete under
weak-truth-table reductions. This was improved significantly by Kummer, who showed
that $R_K$ is complete under truth-table reductions [3] (even under disjunctive
truth-table reductions (dtt-reductions)). Thus there is a computable time bound
$t$ and a function $f$ computable in time $t$ such that, for every $x$, $f(x)$ is a list of
strings with the property that $f(x)$ contains an element of $R_K$ if and only if $x$ is
not in the halting problem. Kummer’s argument in [3] is not very specific about
the time bound $t$. Can this reduction be performed in exponential time? Or in
doubly-exponential time?¹ In this paper, we provide an answer to this question;
surprisingly, it is neither “yes” nor “no”.

visiting CWI, Amsterdam and while a graduate student at Rutgers University, NJ.

Kummer’s theorem is not primarily a theorem about complexity, but about
computability. More recently, however, attention was drawn to the question of
what can be efficiently reduced to $R_K$. Using derandomization techniques, it was
shown in [1] that every r.e. set is reducible to $R_K$ via reductions computable by
polynomial-size circuits. This leads to the question of what can be reduced to
$R_K$ by polynomial-time machines. In partial answer to this question, it was also
shown in [1] that PSPACE is contained in $\mathrm{P}^{R_K}$.

Question: Is it possible to characterize PSPACE in terms of efficient
reductions to $R_K$?

Our goal throughout this paper is to try to answer this question. We present 
a concrete hypothesis later in the paper. Before presenting the hypothesis, how- 
ever, it is useful to present some of our work that relates to Kummer’s theorem, 
because it highlights the importance of being very precise about what we mean
by “the Kolmogorov random strings”.

Our first theorem suggests that Kummer’s reduction might be computable 
in doubly-exponential time. 

Theorem 1. There exists a universal Turing machine $U$ such that $\{0^{2^x} : x$ is
not in the Halting problem$\}$ is polynomial-time reducible to $R_{K_U}$ (and in fact
this reduction is even a $\le^p_{dtt}$ reduction).

Note that, except for the dependence on the choice of universal machine 
U, this is a considerable strengthening of the result of [3], since it yields a 
polynomial-time reduction (starting with a very sparse encoding of the halting 
problem). In addition, the proof is much simpler. 

However, the preceding theorem is unsatisfying in many respects. The most 
annoying aspect of this result is that it relies on the construction of a fairly 
“weird” universal Turing machine U. Is this necessary, or does it hold for every 
universal machine? Note that one of the strengths of Kolmogorov complexity 
theory has always been that the theory is essentially insensitive to the particu- 
lar choice of universal machine. We show that for this question (as well as for 
other questions regarding the set of Kolmogorov-random strings) the choice of 
universal machine does matter. 



1.1 Universal Machines Matter 

To illustrate how the choice of universal machine matters, let us present a corol- 
lary of our Theorem 8. 

¹ Kummer does show in [3] that completeness under truth-table reductions does not
hold under some choices of numberings of the r.e. sets; however his results do hold
for every choice of a universal Turing machine (i.e., “Kolmogorov” numberings, or
“optimal Gödel numberings”). Kummer’s result holds even under a larger class of
numberings known as “optimal numberings”. For background, see [5].

Corollary 1. Let $t$ be any computable time bound. There exists a universal Turing
machine $U$ and a decidable set $A$ such that $A$ is not dtt-reducible to $R_{K_U}$ in
time $t$.

Thus, in particular, the reason why Kummer was not specific about the
running time of his truth-table reduction in [3] is that no such time bound can be
stated, without being specific about the choice of universal Turing machine. This
stands in stark contrast to the result of [1], showing that the halting problem
is P/poly-reducible to $R_K$; the size of that reduction does not depend on the
universal Turing machine that is used to define $R_K$.

Most notions in complexity theory (and even in computability theory) are
invariant under polynomial-time isomorphisms. For instance, using the techniques
of [2] it is easy to show that for any reasonable universal Turing machines $U_1$
and $U_2$, the corresponding halting problems $H_i = \{(x,y) : U_i(x,y) \text{ halts}\}$ are
p-isomorphic. However, it follows immediately from Corollary 1 and Theorem 1
that the corresponding sets of random strings $R_{K_{U_i}}$ are not all p-isomorphic.

Corollary 2. Let $t$ be any computable time bound. There exist universal Turing
machines $U_1$ and $U_2$ such that $R_{K_{U_1}}$ is not dtt-reducible to $R_{K_{U_2}}$ in time $t$. In
particular, $R_{K_{U_1}}$ and $R_{K_{U_2}}$ are not isomorphic via isomorphisms computable in
time $t$.

(We believe that the situation is actually even worse than this, in that the
quantifiers in the preceding corollary can be switched. Even if we take $U_1$ to be
the “standard” universal machine, and we define $U_2(0d) = U_1(d)$, we do not see
how to construct a computable isomorphism between $R_{K_{U_1}}$ and $R_{K_{U_2}}$.)

The lesson we bring away from the preceding discussion is that the choice
of universal machine is important, in any investigation of the question of what
can be efficiently reduced to the random strings. In contrast, all of the results of
[1] (showing hardness of $R_K$) hold no matter which universal Turing machine is
used to define Kolmogorov complexity.

Another obstacle that seems to block the way to any straightforward
characterization of complexity classes in terms of $R_K$ is the fact that, for every universal
Turing machine $U$ and every computable time bound $t$, there is a recursive set $A$
such that $A \le^p_{dtt} R_{K_U}$ but such that $A \notin \mathrm{DSPACE}(t)$ (Theorem 10). Thus $\mathrm{P}^{R_{K_U}}$
may not correspond to any reasonable complexity class. How can we proceed
from here?

We offer the following hypothesis, as a way of “factoring out” the effects of 
pathological machines. In essence, we are asking what can be reduced to the 
K-random strings, regardless of the universal machine that is used. 

Hypothesis 2. $\mathrm{PSPACE} = \mathrm{REC} \cap \bigcap_U \mathrm{P}^{R_{K_U}}$.

We are unable to establish this hypothesis (and indeed, we stop short from 
calling it a “conjecture”). However, we do prove an analogous statement for 
polynomial-time dtt reductions. 

Motivation for studying dtt reductions comes from Kummer’s paper [3]
(presenting a dtt reduction from the complement of the halting problem to $R_K$),
as well as from Theorem 1 and Corollary 1. The following theorem is similar
in structure to Hypothesis 2, indicating that it is possible to “factor out” the
choice of universal machine in some instances.

Theorem 3. $\mathrm{P} = \mathrm{REC} \cap \bigcap_U \{A : A \le^p_{dtt} R_{K_U}\}$.

We take this as weak evidence that something similar to Hypothesis 2 might 
be true, in the sense that it shows that “factoring out” the effects of universal ma- 
chines can lead to characterizations of complexity classes in terms of reducibility 
to the random strings. 

1.2 Approaching the Hypothesis 

In order to prove Hypothesis 2, one must be able to show that there are decidable
sets that cannot be reduced efficiently to $R_{K_U}$ for some $U$. Currently we are
able to do this only for some restricted classes of polynomial-time truth-table
reductions: (a) monotone truth-table reductions, (b) parity truth-table reductions,
and (c) truth-table reductions that ask at most $n^\alpha$ queries, for $\alpha < 1$.

In certain instances, we are able to prove a stronger property. In the case of
parity truth-table reductions and disjunctive reductions, if there is a reduction
computable in time $t$ from $A$ to $R_{K_U}$ for every $U$, then $A$ can already be computed
nearly in time $t$. That is, for these classes of reducibilities, a reduction to $R_K$
that does not take specific properties of the universal machine into account is
nearly useless. We believe that this is likely to be true for any polynomial-time
truth-table reduction. Note that this stands in stark contrast to polynomial-time
Turing reducibility, since PSPACE-complete problems are expected to require
exponential time, but can be solved in polynomial time with $R_K$ as an oracle. An
even stronger contrast is provided by NP-Turing reducibilities. The techniques
of [1] can be used to show that $\mathrm{NEXP} \subseteq \mathrm{NP}^{R_K}$; and thus $R_K$ provably provides
an exponential speed-up in this setting.

2 Preliminaries and Definitions 

In this section we present some necessary definitions. Many of our theorems make 
reference to “universal” Turing machines. Rather than give a formal definition of 
what a universal Turing machine is, which might require introducing unnecessary 
complications in our proofs, we will leave the notion of a “universal” Turing 
machine as an intuitive notion, and instead use the following properties that are 
widely known to hold for any natural notion of universal Turing machine, and 
which are also easily seen to hold for the universal Turing machines that we 
present here: 

— For any two universal Turing machines $U_1$ and $U_2$, the halting problems
for $U_1$ and $U_2$ are p-isomorphic. That is, $U_1$ halts on input $x$ if and only
if $U_2$ halts on input $x'$ (where $x'$ encodes the information $(U_1, x)$ in a
straightforward way). This is a length-increasing and invertible reduction;
p-isomorphism now follows by [2].

— For any two universal Turing machines $U_1$ and $U_2$, there exists a constant $c$
such that $K_{U_1}(x) \le K_{U_2}(x) + c$.

Let $U_1$ be the “standard” universal Turing machine. If $U_2$ is any other machine
that satisfies the two properties listed above, then we will consider $U_2$ to be a
universal Turing machine. We are confident that our results carry over to other,
more stringent definitions of “universal” Turing machine that one might define.
This does not seem to us to be an interesting direction to pursue.

We define $R_{K_U} = \{x \in \{0,1\}^* : K_U(x) \ge |x|\}$. When we state a result that
is independent of a particular choice of a universal Turing machine $U$ we will
drop the $U$ in $K_U$ and refer simply to $K(x)$.
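
To make the definition concrete, the following toy sketch computes $K_U$ and membership in $R_{K_U}$ by brute force for a small, hypothetical “decompressor” that is deliberately not universal (for a universal machine, $K_U$ is not computable). We read the definition as calling $x$ random when no description shorter than $x$ produces it, i.e. $K_U(x) \ge |x|$. All names in the block are our own.

```python
from itertools import product


def toy_machine(d):
    """Hypothetical decompressor: '1'+w outputs w, '0'+w outputs ww. Not universal."""
    if d.startswith("1"):
        return d[1:]
    if d.startswith("0"):
        return d[1:] * 2
    return None  # the empty description produces no output


def K(x, machine, max_len):
    """Length of a shortest description of x of length <= max_len, or None."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if machine("".join(bits)) == x:
                return n
    return None


def is_random(x, machine):
    """x is random for this machine if no description shorter than x produces it."""
    return K(x, machine, len(x) - 1) is None
```

Under this toy machine the string `0101` has the description `001` of length 3 and is therefore not random, whereas `0100` has no description shorter than itself.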

2.1 Reductions 

Let $\mathcal{R}$ be a complexity class and $A$ and $B$ be languages. We define the following
types of reductions.

— Many-one reductions. We say that $A$ $\mathcal{R}$-many-one reduces to $B$ ($A \le^{\mathcal{R}}_m B$)
if there is a function $f \in \mathcal{R}$ such that for any $x \in \Sigma^*$, $x \in A$ if and only if
$f(x) \in B$.

— Truth-table reductions. We say that $A$ $\mathcal{R}$-truth-table reduces to $B$ ($A \le^{\mathcal{R}}_{tt} B$)
if there is a pair of functions $q$ and $r$, both in $\mathcal{R}$, such that on an input
$x \in \Sigma^*$, function $q$ produces a list of queries $q_1, q_2, \ldots, q_m$ so that for
$a_1, a_2, \ldots, a_m \in \{0,1\}$ where $a_i = B(q_i)$, it holds that $x \in A$ if and only
if $r(\langle x, (q_1, a_1), (q_2, a_2), \ldots, (q_m, a_m) \rangle) = 1$.

If $r = \bigwedge_i a_i$, then the reduction is called a conjunctive truth-table reduction
($\le^{\mathcal{R}}_{ctt}$). If $r = \bigvee_i a_i$, then the reduction is called a disjunctive truth-table
reduction ($\le^{\mathcal{R}}_{dtt}$). If the function $r$ computes the parity of $a_1, a_2, \ldots, a_m$,
then the reduction is called a parity truth-table reduction ($\le^{\mathcal{R}}_{\oplus tt}$). If the
function $r$ is monotone with respect to $a_1, a_2, \ldots, a_m$, then the reduction
is called a monotone truth-table reduction ($\le^{\mathcal{R}}_{mtt}$). A function $r$ is monotone
with respect to $a_1, \ldots, a_m$ if for any input $x$, any set of queries $q_1, \ldots, q_m$,
and $a_1, \ldots, a_m, a_1', \ldots, a_m' \in \{0,1\}$, where for all $i$, $a_i \le a_i'$, if $r$ accepts
$\langle x, (q_1, a_1), (q_2, a_2), \ldots, (q_m, a_m) \rangle$ then it is also the case that $r$ accepts
the tuple $\langle x, (q_1, a_1'), (q_2, a_2'), \ldots, (q_m, a_m') \rangle$. If the number of queries $m$ is
bounded by a constant, then the reduction is called a bounded truth-table
reduction ($\le^{\mathcal{R}}_{btt}$). If the number of queries $m$ is bounded by $f(n)$, then the
reduction is called an $f(n)$ truth-table reduction.

— Turing reductions. We say that $A$ $\mathcal{R}$-Turing reduces to $B$ ($A \le^{\mathcal{R}}_T B$) if there
is an oracle Turing machine in class $\mathcal{R}$ that accepts $A$ when given $B$ as an
oracle.
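
The shape of a truth-table reduction can be sketched in a few lines. The sketch ignores all resource bounds (the class $\mathcal{R}$ plays no role here), models the oracle $B$ as a Boolean function, and passes the query/answer pairs to $r$ as a list; the function names are our own.

```python
def tt_reduce(x, q, r, oracle):
    """Decide x by asking all queries q(x) up front and evaluating r on the answers."""
    queries = q(x)
    answers = [1 if oracle(qi) else 0 for qi in queries]
    return r(x, list(zip(queries, answers))) == 1


def conjunctive(x, pairs):
    """Evaluator of a conjunctive reduction: accept iff every answer is 1."""
    return int(all(a for _, a in pairs))


def disjunctive(x, pairs):
    """Evaluator of a disjunctive reduction: accept iff some answer is 1."""
    return int(any(a for _, a in pairs))


def parity(x, pairs):
    """Evaluator of a parity reduction: accept iff the number of 1 answers is odd."""
    return sum(a for _, a in pairs) % 2
```

The queries are generated before any answer is seen, which is what distinguishes a truth-table reduction from a general Turing reduction. The conjunctive and disjunctive evaluators are both monotone in the sense defined above, while the parity evaluator is not.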

3 Inside 

We have two kinds of results to present in this section. First we present several 
theorems that do not depend on the choice of universal machine. Then we present 
our results that highlight the effect of choosing certain universal machines. 

3.1 Inclusions That Hold for All Universal Machines

The following is a strengthened version of claims that were stated without proof
in [1].

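Theorem 4.

1. $\{A \in \mathrm{REC} : A \le^p_{ctt} R_K\} \subseteq \mathrm{P}$.

2. $\{A \in \mathrm{REC} : A \le^p_{btt} R_K\} \subseteq \mathrm{P}$.

3. $\{A \in \mathrm{REC} : A \le^p_{mtt} R_K\} \subseteq \mathrm{P/poly}$.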

Proof. In all three arguments we will have a recursive set $A$ that is polynomial-time
truth-table reducible to $R_K$, where $(q,r)$ is the pair of polynomial-time-computable functions
defining the $\le^p_{ctt}$, $\le^p_{btt}$ and $\le^p_{mtt}$ reductions, respectively. For $x \in \{0,1\}^*$, $Q(x)$
will denote the set of queries produced by $q$ on input $x$.

1. $(q,r)$ computes a $\le^p_{ctt}$ reduction. For any $x \in A$, $Q(x) \subseteq R_K$. Hence,
$Q = \bigcup_{x \in A} Q(x)$ is an r.e. subset of $R_K$. Since $R_K$ is immune (i.e., has no infinite
r.e. subset), $Q$ is finite. Hence we can hard-wire $Q$ into a table and conclude that
$A \in \mathrm{P}$.

2. $(q,r)$ computes a $\le^p_{btt}$ reduction. We will prove the claim by induction on
the number of queries. If the reduction does not ask any query, the claim is trivial.
Assume that the claim is true for reductions asking fewer than $k$ queries. We will
prove the claim for reductions asking at most $k$ queries. Take $(q, r)$ that computes
a $\le^p_{btt}$ reduction and such that $|Q(x)| \le k$, for all $x$. For any string $x$, let
$m_x = \min\{|q| : q \in Q(x)\}$. We claim that there exists an integer $l$ such that for any $x$,
if $m_x > l$ and $Q(x) = \{q_1, q_2, \ldots, q_{k'}\}$ then
$r(\langle x, (q_1, 0), (q_2, 0), \ldots, (q_{k'}, 0) \rangle) = A(x)$.
For contradiction assume that for any integer $l$, there exists $x$ such that
$m_x > l$ and $r(\langle x, (q_1, 0), (q_2, 0), \ldots, (q_{k'}, 0) \rangle) \ne A(x)$. Since $A$ is recursive, for
any $l$, we can find the lexicographically first $x_l$ having such a property. All the
queries in $Q(x_l)$ are longer than $l$ and at least one of them should be in $R_K$.
However, each of the queries can be described by $O(\log l)$ bits, which is a
contradiction. Hence, there exists an integer $l$ such that for any $x$, if $m_x > l$
then $r(\langle x, (q_1, 0), (q_2, 0), \ldots, (q_{k'}, 0) \rangle) = A(x)$. Thus we can encode the answers
for all queries of length at most $l$ into a table and reduce the number of queries
in our reduction by one. Then we can apply the induction hypothesis.

3. $(q,r)$ computes a $\le^p_{mtt}$ reduction, and $q$ is computable in time $n^c$, for some
$c > 1$. We claim that $r$ does not depend on any query of length more than
$2c \log n$. Assume that for infinitely many $x$, $r$ does depend on queries of length
more than $2c \log |x|$, i.e., if $Q(x) = \{q_1, q_2, \ldots, q_m\}$ and $a_1', a_2', \ldots, a_m' \in \{0,1\}$
are such that $a_i' = 1$ if $q_i \in R_K$ and $|q_i| \le 2c \log |x|$, and $a_i' = 0$ otherwise,
then $r(\langle x, (q_1, a_1'), (q_2, a_2'), \ldots, (q_m, a_m') \rangle) \ne A(x)$. Since $r$ is monotone, this may
happen only for $x$ that belong to $A$. The set of all such $x$ can be enumerated, by
assuming that all queries of length greater than $2c \log |x|$ are not in $R_K$ and
assuming that all shorter queries are in $R_K$, and then computing successively better
approximations to the correct answers for the short queries by enumerating the
complement of $R_K$, until an answer vector is obtained on which $r$ evaluates to
zero, although $x$ is in $A$. Note that for better approximations to the true value
of $R_K$, $r$ will still evaluate to zero because $r$ is a monotone reduction. Hence for
given $l$, we can find the first $x$ of length more than $l$ in this enumeration. One of
the queries in $Q(x)$ is of length more than $2c \log l$ and it belongs to $R_K$. But we
can describe every query in $Q(x)$ by $c \log l + 2 \log\log l + \log l + O(1)$ bits, which
is less than $2c \log l$. That is a contradiction. Since we have established that $r$
depends only on queries of length at most $2c \log n$, we can encode information
about all strings of this size that belong to $R_K$ into a polynomially large table.
Thus $A$ is in P/poly.



Theorem 5. If $A$ is recursive and it reduces to $R_K$ via a polynomial-time
$f(n)$-truth-table reduction then $A$ is in $\mathrm{P}/(f(n) 2^{f(n) 3 \log f(n)})$.

Corollary 3. If $A$ is recursive and reduces to $R_K$ via a polynomial-time truth-table
reduction with $O(\log n / \log\log n)$ queries then $A$ is in P/poly.

Corollary 4. Let $g(n)$ be such that $g(n) 2^{g(n) 3 \log g(n)} < 2^n$. Then there exists a
recursive $A$ such that $A$ does not reduce to $R_K$ via a polynomial-time $g(n)$-truth-table
reduction. In particular for any $\alpha < 1$ there exists a recursive $A$ that does
not reduce to $R_K$ via a polynomial-time $n^\alpha$-truth-table reduction.

Proof of Theorem 5. W.l.o.g. $f(n)$ is unbounded. Let $M$ be the reduction from
$A$ to $R_K$ that uses at most $f(n)$ queries. Let $Q(x)$ be the query set that $M(x)$
generates. We will remove from $Q(x)$ all the strings that have length at least
$s_n = 2 \log f(n) + 2 \log\log f(n) + c$ for some suitably chosen constant $c$. Let
$Q'(x) = Q(x) \cap \{0,1\}^{<s_n}$ be this reduced set.

Note that there are at most $2^{s_n}$ strings of length less than $s_n$ and that there
are at most $(2^{s_n})^{f(n)} \le 2^{f(n) 3 \log f(n)}$ possible subsets $Q'(x)$. Partition
$\{0,1\}^n$ into equivalence classes, where $[x] = \{y : Q'(y) = Q'(x)\}$. We will show
that for each equivalence class $[x]$ there is an answer sequence $v_x$ such that,
for all $y \in [x]$, $y$ is in $A$ if and only if $M$ accepts $y$ when the answers to $Q(y)$
are answered according to $v_x$ for all of the queries in $Q'(y)$, and all of the long
queries are answered negatively.

Thus the advice string consists of an encoding of $v_x$, which can be written
using $f(n)$ bits, for each possible set $Q'(x)$. This yields the desired advice bound.

It remains only to show that the string v_x exists. Assume otherwise. Thus,
given m, there is a recursive procedure that finds the lexicographically first string
x of length n such that log f(n) > m and for all v there is some y_v ∈ [x] on
which the result of running M(y_v) with answer vector v does not answer correctly
about whether y_v is in A. Let v be the answer sequence for Q'(x) ∩ R_K, and let
r be the number of 1's in v (i.e., r is the size of Q'(x) ∩ R_K). Thus, given (m, r)
we can compute Q'(x) and start the enumeration of the complement of R_K until
we have enumerated all but r elements of Q'(x). Thus we can compute v and
find y_v. Since M(y_v) is not giving the correct answer about whether y_v is in
A, but M does give the correct answer when using R_K as an oracle, it follows
that Q(y_v) contains an element of R_K of length greater than s_n. However, this
string is described by the tuple (m, r, i), along with O(1) additional bits. For the
appropriate choice of c, this has length less than s_n, which is a contradiction. □

Note that the preceding proof actually shows that, for every x such that [x]
has “small enough” Kolmogorov complexity, we can pick v_x to be the answer
sequence for Q'(x) ∩ R_K. If this were true for every x, then it would follow
easily that every decidable set A that is reducible to R_K via a polynomial-time
truth-table reduction is in P/poly.

3.2 Pathological Universal Machines 

Before presenting the results of this section, we digress in order to introduce 
some techniques that we will need. 

The following development is motivated by a question that one can naturally
ask: what is the size of (R_K)^{=n}? It is a part of folklore that the number of strings
in R_K of length n is Kolmogorov random. But is it odd or even? One would be
tempted to answer that since |(R_K)^{=n}| is Kolmogorov random, the parity of it
must also be random. The following universal Turing machine U_even shows that
this is not the case.

Let U_st be the “standard” universal Turing machine. Consider the universal
Turing machine U_even defined by: for any d ∈ {0,1}*, U_even(0d) = U_st(d) and
U_even(1d) = the bit-wise complement of U_st(d). It is immediate that the size of
(R_{K_{U_even}})^{=n} is even for all n. To construct a universal Turing machine U_odd for
which the size of (R_{K_{U_odd}})^{=n} is odd for all n (large enough) is a little bit more
complicated.
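
The pairing argument behind U_even can be checked on a toy example. The sketch below is only an illustration under strong assumptions: a finite lookup table stands in for the standard universal machine (the table `u_st` and the helper names are hypothetical), and a string counts as random when no table entry describes it with fewer bits than its length.

```python
from itertools import product

def complement(x):
    """Bit-wise complement of a binary string."""
    return "".join("1" if b == "0" else "0" for b in x)

def make_even(u_st):
    """Build the table of U_even from a finite table standing in for U_st."""
    u_even = {}
    for d, x in u_st.items():
        u_even["0" + d] = x               # U_even(0d) = U_st(d)
        u_even["1" + d] = complement(x)   # U_even(1d) = complement of U_st(d)
    return u_even

def random_strings(table, n):
    """Strings of length n that have no description shorter than n in the table."""
    described = {x for d, x in table.items() if len(x) == n and len(d) < n}
    candidates = ("".join(bits) for bits in product("01", repeat=n))
    return [s for s in candidates if s not in described]

# Hypothetical finite stand-in for U_st: description -> output.
u_st = {"": "000", "0": "0110", "1": "101", "01": "1111"}
u_even = make_even(u_st)
parities = {n: len(random_strings(u_even, n)) % 2 for n in range(1, 6)}
```

Because a string and its complement always receive descriptions of the same length, the described strings of each length come in complementary pairs, so every value in `parities` is 0.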

We will need the following definition. For any Turing machine U we can
construct an enumerator (Turing machine) E that enumerates all pairs (d, x)
such that U(d) = x, for d, x ∈ {0,1}*. (The running time of E is possibly
infinite.) Conversely, given an enumerator E that enumerates pairs (d, x) so
that if (d, x) and (d, x') are enumerated then x = x', we can construct a Turing
machine U such that for any x, d ∈ {0,1}*, U(d) = x if and only if E ever
enumerates the pair (d, x). In the following, we will often define a Turing machine
in terms of its enumerator.

We define U_odd in terms of its enumerator E_odd that works as described
below. E_odd will maintain sets of non-random strings {N_i}_{i∈N} during its
operation. At any point in time, set N_i will contain non-random strings of length i
that were enumerated by E_odd so far. E_odd will try to maintain the size of sets
N_i to be odd (except while they are empty).

Initialize all {N_i}_{i∈N} to the empty set.^1

For all d ∈ {0,1}*, run U_st(d) in parallel.

Whenever U_st(d) halts for some d and produces a string x do:

Output (0d, x).

If |0d| < |x| and N_{|x|} = ∅ then set N_{|x|} := {x}.

Else if |0d| < |x| and x ∉ N_{|x|} then:

Pick the lexicographically first string y in {0,1}^{|x|} − (N_{|x|} ∪ {x}).
Set N_{|x|} := N_{|x|} ∪ {x, y} and output (1d, y).

Continue.

End.

^1 We assume in the usual way that E_odd works in steps and at step s it initializes the
s-th set of {N_i}_{i∈N} to the empty set. Our statements regarding actions that involve
infinite computation should be interpreted in a similar way.

It is easy to see that the Turing machine U_odd defined by the enumerator
E_odd is universal. Also it is clear that for all n large enough, (R_{K_{U_odd}})^{=n} is of
odd size.
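
The bookkeeping performed by E_odd can be replayed on a finite list of halting events. The following is a minimal sketch, not the authors' construction itself: the event list is a hypothetical stand-in for the dovetailed runs of U_st (each description d is assumed to appear at most once), and the function and variable names are illustrative.

```python
from itertools import product

def lex_first_outside(length, excluded):
    """Return the lexicographically first binary string of this length not in `excluded`."""
    for bits in product("01", repeat=length):
        s = "".join(bits)
        if s not in excluded:
            return s
    raise ValueError("every string of this length is already excluded")

def e_odd(halting_events):
    """Replay the steps of E_odd on events (d, x), each meaning U_st(d) halted with output x."""
    nonrandom = {}   # N_i: length i -> strings made non-random so far
    output = []      # enumerated pairs (description, string) defining U_odd
    for d, x in halting_events:
        output.append(("0" + d, x))
        if len(d) + 1 >= len(x):
            continue                 # 0d is not shorter than x, so N_|x| is unaffected
        n_x = nonrandom.setdefault(len(x), set())
        if not n_x:
            n_x.add(x)               # first non-random string of this length
        elif x not in n_x:
            y = lex_first_outside(len(x), n_x | {x})
            n_x |= {x, y}            # two new strings at once, so |N_|x|| stays odd
            output.append(("1" + d, y))
    return output, nonrandom

# Hypothetical halting events of U_st, in the order in which they are discovered.
events = [("0", "000"), ("1", "111"), ("00", "0000"), ("", "101"), ("01", "000")]
pairs, sets_by_length = e_odd(events)
```

In this replay every non-empty set N_i has odd size, and no description is enumerated twice, which is what the enumerator needs in order to define a machine.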

The ability to influence the parity of (R_{K_U})^{=n} allows us to (sparsely) encode
any recursively enumerable information into R_{K_U}. We can state the following
theorem.

Theorem 6. For any recursively enumerable set A, there is a universal Turing
machine U such that if C = {0^{2^x} : x ∈ A}, then C ≤^p_⊕ R_{K_U}. Consequently,
A ≤^{EE}_⊕ R_{K_U}.

Proof. Observe, C ⊆ {0^{2^i} : i ∈ N}. We will construct the universal Turing
machine U so that for any integer i > 3, 0^{2^i} ∈ C if and only if (R_{K_U})^{=i} is of odd
size. Then, the polynomial time parity reduction of C to R_{K_U} can be constructed
trivially as well as the double-exponential parity reduction of A to R_{K_U}.

Let M be the Turing machine accepting the recursively enumerable set C.
We will construct an enumerator E for U. It will work as follows. E will maintain
sets {N_i}_{i∈N} during its computations. At any point in time, for every i > 0 the
set N_i will contain non-random strings of length i that were enumerated by E so
far and E will try to maintain the parity of |N_i| unchanged during most of the
computation. E will also run M on all strings z = 0^{2^i} in parallel and whenever
some new string z is accepted by M, E will change the parity of N_{log|z|}
by making some new string of length log |z| non-random. The algorithm for E
is the following.

Initialize all {N_i}_{i∈N} to the empty set.

For all d ∈ {0,1}* and z ∈ {0^{2^i} : i ∈ N}, run U_st(d) and M(z) in parallel.
Whenever U_st(d) or M(z) halts for some d or z = 0^{2^i} do:

If U_st(d) halts and produces output x then:

Output (00d, x).

If |00d| < |x| and x ∉ N_{|x|} then:

Pick the lex. first string y in {0,1}^{|x|} − (N_{|x|} ∪ {x}).

Set N_{|x|} := N_{|x|} ∪ {x, y} and output (01d, y).

Continue.

If M(0^{2^i}) halts and i > 3 then:

Pick the lexicographically first string y in {0,1}^i − N_i.
Set N_i := N_i ∪ {y}, and output (1^{i−1}, y).

Continue.

End.

Clearly, enumerator E defines a universal optimal Turing machine and for
any integer i > 3, 0^{2^i} ∈ C if and only if (R_{K_U})^{=i} is of odd size.
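
The parity reduction mentioned in the proof is simple enough to sketch. In the code below the oracle `is_random` is an assumed black box for membership in the set of random strings (here replaced by a small hypothetical set), and `in_C` is an illustrative name; the 2^i queries are polynomially many in the length of the input 0^{2^i}.

```python
from itertools import product

def in_C(i, is_random):
    """Decide whether 0^(2^i) is in C from the parity of the number of
    random strings of length i, using 2^i oracle queries."""
    count = sum(1 for bits in product("01", repeat=i) if is_random("".join(bits)))
    return count % 2 == 1

# Hypothetical oracle answers for length 4, for illustration only.
toy_random_strings = {"0000", "0101", "1111"}
answer = in_C(4, toy_random_strings.__contains__)
```

The equivalence between parity and membership is guaranteed only for the specially constructed machine U and for i > 3; the toy oracle merely exercises the counting step.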

Parity is not the only way to encode information into R_K. The following
theorem illustrates that we can encode the information so that one can use
reductions to extract it. In particular, this proves our Theorem 1.

Theorem 7. For any recursively enumerable set A, there is a universal Turing 
machine U such that if C = {0^ : x € A}, then Consequently, 

A<gfRKu- 

Proof. First, define a universal Turing machine U_opt as follows: U_opt(0d) = U_st(d)
and U_opt(1d) = d. Clearly, for any x ∈ {0,1}*, K_{U_opt}(x) ≤ |x| + 1. For any
d ∈ {0,1}* and any s ∈ {0,1}^5, U is defined as follows:

On input 0ds, run U_opt(d) and if U_opt(d) halts then output U_opt(d)s.

On input 1d do:

Run U_opt(d), until it halts.

Let y be the output of U_opt(d).

Check if 0^{2^{|y|}} ∈ C.

If 0^{2^{|y|}} ∈ C then output y0^5.

End.

It is clear that for any x ∈ {0,1}*, K_U(x) ≤ |x| + 2. Further, for any s, s' ∈
{0,1}^5 − {0^5}, K_U(xs) = K_U(xs'). Finally, for any y ∈ {0,1}*, 0^{2^{|y|}} ∈ C if and
only if K_U(y0^5) ≤ K_U(y1^5) − 4. Hence, if 0^{2^{|y|}} ∈ C then y0^5 ∉ R_K. The
reduction of C to R_K works as follows: on input 0^{2^n}, for all y ∈ {0,1}^n ask
queries y0^5. Output 0 if none of the queries lies in R_K and 1 otherwise.

One could start to suspect that maybe all recursive functions are reducible
to R_K in, say, doubly exponential time, regardless of which universal Turing
machine is used to define R_K. We do not know if that is true but the following
theorem shows that certainly disjunctive truth-table reductions are not sufficient.

Theorem 8. For any computable time-bound t{n) > n, every set A in REC O 
: A<*Jg^ Rku) is computable in time 0{C{n)). 

Theorem 3 is a corollary of Theorem 8. 

Proof. It suffices to show that for each decidable set A that is not computable in 
time OfC{n)), there is a universal machine U such that A is not <^j"^-reducible 
to Rku- Fix a set A not computable in time 0{C{n)). 

Let U_st be a (standard) universal Turing machine, and define U so that for
all d, U(00d) = U_st(d). Note that, for every length m, fewer than 1/4 of the strings
of length m are made non-random in this way.

Now we present a stage construction, defining how U treats descriptions
d ∉ {00}{0,1}*. We present an enumeration of pairs (d, y); this defines U(d) = y.
In stage i, we guarantee that the i-th Turing machine q_i that runs in time t(n)
(in an enumeration of clocked Turing machines computing ≤^t_dtt reductions) does
not reduce A to R_{K_U}.

At the start of stage i, there is a length l_i with the property that at no later
stage will any string y of length less than l_i be enumerated in our list of pairs
(d, y). (At stage 1, let l_1 = 1.)

Let T be the set of all subsets of the strings of length less than l_i. For any
string x, denote by Q_i(x) the list of queries produced by the ≤^t_dtt reduction
computed by q_i on input x, and let Q'(x) be the set of strings in Q_i(x) having
length less than l_i.

In stage i, the construction starts searching through all strings of length l_i
or greater, until strings x_0 and x_1 are found, having the following properties:

- x_0 ∉ A,

- x_1 ∈ A,

- Q'(x_1) = Q'(x_0), and

- One of the following holds:

  • Q_i(x_1) contains fewer than 2^{m−2} elements from {0,1}^m for each length
    m ≥ l_i, or

  • Q_i(x_0) contains at least 2^{m−2} elements from {0,1}^m for some length
    m ≥ l_i.

We argue below that strings x_0 and x_1 will be found after a finite number of
steps.

If Q_i(x_1) contains fewer than 2^{m−2} elements from {0,1}^m for each length
m ≥ l_i, then for each string y of length m ≥ l_i in Q_i(x_1), pick a different d of
length m − 2 and add the pair (1d, y) to the enumeration. This guarantees that
Q_i(x_1) contains no element of R_{K_U} of length ≥ l_i. Thus if q_i is to be a ≤^t_dtt
reduction of A to R_{K_U}, it must be the case that Q'(x_1) contains an element of
R_{K_U}. However, since Q'(x_1) = Q'(x_0) and x_0 ∉ A, we see that q_i is not a ≤^t_dtt
reduction of A to R_{K_U}.

If Q_i(x_0) contains at least 2^{m−2} elements from {0,1}^m for some length m ≥
l_i, then note that at least one of these strings is not produced as output by
U(00d) for any string 00d of length ≤ m − 1. We will guarantee that U does not
produce any of these strings on any description d ∉ {00}{0,1}*, and thus one
of these strings must be in R_{K_U}, and hence q_i is not a ≤^t_dtt reduction of A to
R_{K_U}.

Let l_{i+1} be the maximum of the lengths of x_0, x_1 and the lengths of the
strings in Q_i(x_0) and Q_i(x_1).
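
The first case of the stage can be sketched as follows: when the long queries are sparse at every length, each of them receives its own description of the form 1d with |d| = m − 2, which is shorter than the query itself. This is an illustrative sketch only; the function names and the sample query list are hypothetical, and the threshold 2^{m−2} is written as `4 * count < 2 ** m` to stay in integer arithmetic.

```python
from collections import defaultdict
from itertools import product

def long_queries(queries, l):
    """Group the queries of length at least l by their length."""
    groups = defaultdict(set)
    for q in queries:
        if len(q) >= l:
            groups[len(q)].add(q)
    return groups

def is_sparse(queries, l):
    """True if, for every length m >= l, fewer than 2^(m-2) queries have length m."""
    return all(4 * len(qs) < 2 ** m for m, qs in long_queries(queries, l).items())

def make_nonrandom(queries, l):
    """Give every long query y its own description 1d with |d| = |y| - 2."""
    if not is_sparse(queries, l):
        raise ValueError("some length has too many queries for this case")
    pairs = []
    for m, qs in sorted(long_queries(queries, l).items()):
        descriptions = ("".join(bits) for bits in product("01", repeat=m - 2))
        for d, y in zip(descriptions, sorted(qs)):
            pairs.append(("1" + d, y))
    return pairs

# Hypothetical query list of a reduction on some input, with l = 3.
queries = ["0", "00000", "11111", "01010", "0000"]
pairs = make_nonrandom(queries, 3)
```

Every pair in `pairs` uses a description one bit shorter than the string it describes, and no description is reused within a length, so none of the long queries can remain random.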

It remains only to show that strings x_0 and x_1 will be found after a finite
number of steps. Assume otherwise. It follows that {0,1}* can be partitioned
into a finite number of equivalence classes, where y and z are equivalent if both
y and z have length less than l_i, or if they have length ≥ l_i and Q'(y) = Q'(z).
Furthermore, for the equivalence classes containing long strings, if the class
contains both strings in A and strings in the complement of A, then the strings in A
are exactly the strings on which q_i queries at least 2^{m−2} elements of {0,1}^m for
some length m ≥ l_i.
This yields an 0(t^(rz))-time algorithm for A, contrary to our assumption that 
A is not computable in time 0{t^{n)). 

Theorem 9. For any computable time-bound t(n) > n, every set A in REC ∩
: ^< 0 "^ -Rk[/} is computable in time 0{t^{n)). 

Due to space limitations, the proof is omitted. 

We conclude with the following observation that is a corollary to Kummer's
result [3].

Theorem 10. For every universal Turing machine U and every time- 
constructible function t{n) > n, there is a recursive set A ^ DSPACE(t) such 
that A Rk^ ■ 

Proof. Fix any universal Turing machine U and time-bound t(n) > n. By
Kummer's result, there is a time-bound t' such that the Halting problem dtt-
reduces to R_{K_U} in time t'(n). W.l.o.g. t'(n) > n. Let A ∉ DSPACE(t(t'(2^n)))
be a recursive set. Consider set B = {0* : x G A}. Clearly, 

B ^ DSPACE(t(n)). Since A is recursive, it reduces to Rku via a dtt-reduction 
running in time t'{n‘^), for some constant c. It follows that B Rku- 

4 Conclusions and Open Problems 

Do there exist universal Turing machines U_1 and U_2 so that R_{K_{U_1}} and R_{K_{U_2}} are
not recursively isomorphic? Or so that they are not in the same <m-degree? 

Can one show that not every decidable set is <tt-reducible to Rk (at least 
for some choice of universal machine)? 

Is there a proof of Hypothesis 2? It might be more feasible to prove a related 
hypothesis more in line with Theorems 4 and 5 of Section 3: For any universal 
machine: {A G REC : A<^Rk} C PSPACE/poly 

Abstract. In this work, we investigate the computational complexity of the solution
existence problem for language equations and language constraints. More
precisely, we study constraints between regular terms over an alphabet consisting
of constants and variables and based on regular operators such as concatenation
(·), sum (+), Kleene-star (*). We obtain complexity results concerning three
restricted cases of the constraints: for systems of language equations in which one
side does not contain any occurrences of variables, both for arbitrary solutions and
with restriction to finite languages; and for constraints of the form L ⊆ R, where R has no
occurrences of variables.



1 Introduction 

Language equations can be defined over different sets of operators, but concatenation
must appear among the operators. The properties of language equations have been
studied intensively for many variants of sets of operators ([8], [12]) and various
restrictions put on the form of equations ([11], [3]), the domain of solutions ([4]), the
number of variables, and the size of the alphabet ([11], [9], [5]). Besides, in [11],
not only equations but also constraints with the inclusion operator were considered. The algebraic
properties of language equations, such as the existence of a solution, the problem
of uniqueness and the existence of the greatest solution, and testing whether the system
of language equations has finitely many solutions, were investigated for linear regular
language equations in [10] and [3]. A language equation is linear if each side of the
equation has the form of a linear pattern S_0 ∪ S_1X_1 ∪ ... ∪ S_nX_n where S_0, ..., S_n are
arbitrary formal languages which are called language constants, and X_1, ..., X_n are different
variables. In [3], systems of linear language equations in which all language
constants are regular are considered. We call them linear regular language equations. In [3], among
other things, it was proved that the satisfiability problem for linear language equations
is EXPTIME-complete and that for linear regular language matching this problem is
PSPACE-complete. Moreover, in the two cases mentioned above, satisfiability
has the nice property that if a solution exists then a regular one exists too.

The results concerning language equations are of independent interest, and they
frequently appear in other research contexts. Below we mention several interesting
applications of the results: in [5] undecidability results for process
logic were proved using the result that the satisfiability of equations between regular terms is
undecidable; in [3], [4] the computational complexity of certain cases of the
unification problem in description logic was shown, obtained by reduction of the
satisfiability of linear regular language equations; in [2] the satisfiability
problem was considered for certain cases of linear regular language equations whose motivation
arises from coding theory, under the additional assumption that the operations
must be unambiguous.

Regular language equations have almost always been investigated in the linear case.
In this paper we consider equations between regular terms based on Kleene's operators
·, +, * in arbitrary form. In the general case the solution existence problem for equations
between regular terms is undecidable even if only one variable is used.

We establish the computational complexity of three cases of language equations and
language constraints: for a system of language equations in which one side does not
contain any occurrences of variables the satisfiability problem is EXPSPACE-complete,
and if we want to decide whether there exists a solution over finite languages, rather than over
arbitrary languages, the problem remains EXPSPACE-complete; the satisfiability problem for
a constraint of the form L ⊆ R, where R has no occurrences of variables, is PSPACE-complete.
In the first, second and third cases we show that if a solution exists then
a regular one exists too. Besides, we show that if any solution is maximal then it is
regular.

2 Notations and Preliminaries 

Let Σ, V be two disjoint alphabets. Σ is a finite set of letters (terminal symbols) and V
is a set of variables. We use the lower-case letters a, b, c, d to denote members of Σ and
X, Y, U, W to denote variables.

Regular expressions (over some Σ) are usually based on the operator · (concatenation)
and the set operators * (Kleene-closure), + (union). Sometimes, however, they are considered
with additional set operators, namely with intersection or with complement. We call
them respectively semi-extended regular expressions and extended regular expressions.
Additional operators enable representing regular languages in a more succinct form.
L(R) denotes the language generated by a regular expression R.

Let REGE_{Σ∪V} be the set of regular expressions over Σ ∪ V. We will call them
open regular expressions. If A(X_1, ..., X_n) ∈ REGE_{Σ∪V} and X_1, ..., X_n are variables
which can occur an arbitrary number of times in the term A then one can interpret this term
as an operator which takes arbitrary formal languages L_1, ..., L_n ⊆ Σ* and provides a
new language L(A(L_1, ..., L_n)). In the further part of this paper we will consider only
open and closed expressions which are built over

Definition 1. Let ◇ be a set operator from {=, ⊆, ⊇}. A constraint A(X_1, ..., X_n) ◇
B(X_1, ..., X_n) is satisfiable over Σ if there exist nonempty languages L_1, ..., L_n ⊆ Σ*
such that L(A(L_1, ..., L_n)) ◇ L(B(L_1, ..., L_n)). In other words, there exists a
substitution θ = {X_1 ← L_1, ..., X_n ← L_n} such that θ(A(X_1, ..., X_n)) ◇ θ(B(X_1, ..., X_n)).

Let us note that the satisfiability problem for the system of regular language
equations A^i(X_1, ..., X_n) = B^i(X_1, ..., X_n), where 1 ≤ i ≤ m, can be reduced to
the satisfiability of a single equation by adding different strings c_1, ..., c_m of the
same length as prefixes to the sides of the equations:
c_1A^1(X_1, ..., X_n) + ... + c_mA^m(X_1, ..., X_n) = c_1B^1(X_1, ..., X_n) + ... + c_mB^m(X_1, ..., X_n).

A set operator A : (Σ*)^n → Σ* is monotonic iff for all languages
L_1 ⊆ M_1, ..., L_n ⊆ M_n there holds A(L_1, ..., L_n) ⊆ A(M_1, ..., M_n). Let us note
the following simple fact:

Fact 2. ∩, +, * are monotonic operators and every set operator A(X_1, ..., X_n) which is
equivalent to some open regular expression over {∩, +, *, ·} is monotonic.

3 Solving Regular Language Matching 

In this section we will show the computational complexity of the following two problems:

Problem 3. (RLM) Regular Language Matching. INSTANCE: An equation between
regular terms

A(X_1, ..., X_n) = R, (1)

where A(X_1, ..., X_n) is an open regular expression and R is a regular expression
without variables. QUESTION: Is the equation satisfiable?

Problem 4. (CONSTRAINT) INSTANCE: A constraint between regular terms
A(X_1, ..., X_n) ⊆ R, where R is a regular expression without variables. QUESTION: Is the
constraint satisfiable?

A regular expression is a synonym of a finite nondeterministic automaton.
Throughout this section, we will think about R as a nondeterministic automaton
R = ⟨Σ, s, δ, Q, F⟩ where Σ is a finite alphabet, s is a start state, Q = {q_1, ..., q_k}
is a set of states, F is a set of accepting states and δ is a transition function. Without
loss of generality, we assume that for each q ∈ Q and w ∈ Σ* there exists p ∈ Q such that
δ(q, w) = p.

Definition 5 (String R-profile). An R-profile of a string w is a set of pairs P^R(w)
= {(q, p) | δ(q, w) = p}. P^R(w) can also be interpreted as a vector [D_1, ..., D_k] of subsets
of Q, where p ∈ D_i if and only if (q_i, p) ∈ P^R(w).

Definition 6 (Language R-profile). An R-profile of a language L is the following class
P^R(L) = {P^R(w) | w ∈ L}.

Definition 7 (R-projection of a language). An R-projection of a language L on Σ* is
a language Π^R(L) = {w | w ∈ Σ* and P^R(w) ∈ P^R(L)}.

Proposition 8. For any nondeterministic automaton R and any language L ⊆ Σ*,
Π^R(L) is regular and it is accepted by an automaton whose size is at most exponential
in the size of R.
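
The notions of R-profile and R-projection can be made concrete with a small sketch. The code below assumes a nondeterministic automaton given as a dictionary from (state, letter) to a set of successor states, and it tests membership in the R-projection only against a finite sample of a language; the function names and the example automaton are illustrative assumptions, not part of the paper.

```python
def step(delta, states, a):
    """All states reachable from `states` by reading the single letter a."""
    return {p for q in states for p in delta.get((q, a), ())}

def profile(states, delta, w):
    """R-profile of w: the set of pairs (q, p) such that p is reachable from q by reading w."""
    pairs = set()
    for q in states:
        current = {q}
        for a in w:
            current = step(delta, current, a)
        pairs |= {(q, p) for p in current}
    return frozenset(pairs)

def in_projection(states, delta, w, sample):
    """w lies in the R-projection of `sample` iff it shares its profile with some member."""
    return profile(states, delta, w) in {profile(states, delta, v) for v in sample}

# Illustrative automaton: state 0 or 1 records the parity of the number of a's read.
states = {0, 1}
delta = {(0, "a"): {1}, (1, "a"): {0}, (0, "b"): {0}, (1, "b"): {1}}
same_profile = profile(states, delta, "aa") == profile(states, delta, "b")
```

Since there are only finitely many possible profiles, every language is projected onto a union of finitely many profile classes, which is the source of the exponential bound in Proposition 8.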

Below, we introduce the definition of a new automaton which will characterize
the behavior of A with respect to the automaton R after the application of a possible
substitution on A(X_1, ..., X_n).

Definition 9 (R-profile automaton). An R-profile automaton for a generalized
nondeterministic automaton A denoting L(A(L_1, ..., L_n)) is A^P = ⟨Σ^P, δ^P, Q^P, s^P, F^P⟩,
where Σ^P = P^R(Σ) ∪ ⋃_{i=1}^{n} P^R(L_i), Q^P = Q^A × 2^Q, s^P = [s^A, {s}], F^P =
{[f^A, I] | I ⊆ Q, I ∩ F ≠ ∅, and f^A ∈ F^A}. ([l_1, R_1], [D_1, ..., D_k], [l_2, R_2]) ∈ δ^P
if and only if one of the following holds: (1) there exists a ∈ Σ such that P^R(a) =
[D_1, ..., D_k], (l_1, a, l_2) ∈ δ^A, and R_2 = ⋃_{i | q_i ∈ R_1} D_i; (2) there exists X_j ∈ V such
that [D_1, ..., D_k] ∈ P^R(L_j), (l_1, X_j, l_2) ∈ δ^A, and R_2 = ⋃_{i | q_i ∈ R_1} D_i.

Lemma 10. Let R be a nondeterministic finite automaton, A^P be the R-profile
automaton for a generalized nondeterministic automaton A associated with a regular term
A(X_1, ..., X_n) and a certain regular language L(A(L_1, ..., L_n)). The following
conditions are equivalent:

1. there is no state in U = {[f^A, I] | I ⊆ Q \ F} which is reachable from s^P,

2. L(A(L_1, ..., L_n)) ⊆ L(R),

3. L(A(Π^R(L_1), ..., Π^R(L_n))) ⊆ L(R).

Theorem 11. The CONSTRAINT problem is PSPACE-complete.

Proof. First, we show how the CONSTRAINT problem can be decided in
PSPACE. If the constraint is satisfiable by L_1, ..., L_n then one can choose single
strings from each of the sets, w_1 ∈ L_1, ..., w_n ∈ L_n, and from Fact 2 we have
L(A({w_1}, ..., {w_n})) ⊆ L(R). Hence, we may try to guess a vector of sets for each
w_i such that P^R(w_i) = [D_1^{w_i}, ..., D_k^{w_i}], and for each a ∈ Σ we can easily obtain P^R(a)
out of the automaton R. The R-profile automaton for the generalized automaton A
denoting L(A({w_1}, ..., {w_n})) has size at most exponentially greater than A and R. Using
Savitch's method [13] for the reachability problem we can decide reachability for two
states of the automaton A^P in NLOGSPACE with respect to the size of the automaton A^P.
Analyzing the definition of A^P it is easy to see that it is possible to guess
nondeterministically (in polynomial space) the index of a state and the label for a transition. Hence, we
obtain an NPSPACE (= PSPACE) algorithm.

To prove the lower bound it is enough to observe that, in particular, the left-hand side of the
constraint may contain no occurrences of variables. This means that it
can be equivalent to Σ*. Hence, we obtain the CONSTRAINT problem of the form
Σ* ⊆ L(R), which is equivalent to Σ* = L(R). But it was proved in [7] that the universality problem for
regular expressions is PSPACE-hard. □

Theorem 12. If the RLM problem has a solution θ = {X_1 ← L_1, ..., X_n ← L_n} then it
also has the regular solution θ^R = {X_1 ← Π^R(L_1), ..., X_n ← Π^R(L_n)}.

Proof. Since θ is a solution of equation (1), the constraint L(R) ⊆
L(A(L_1, ..., L_n)) holds. From the fact that L ⊆ Π^R(L), for any language L, and
from monotonicity (Fact 2) of A(X_1, ..., X_n) it follows that L(A(L_1, ..., L_n)) ⊆
L(A(Π^R(L_1), ..., Π^R(L_n))). However, from Lemma 10 and from the assumption
that L(A(L_1, ..., L_n)) ⊆ L(R) we obtain L(A(Π^R(L_1), ..., Π^R(L_n))) = L(R).

In the case of the satisfiability problem for linear language equations it was proved [3] that
if the system of equations is satisfiable then it has a greatest solution and this solution
is regular. From the above theorem and the fact that if we assume M = {N ⊆ Σ* | P^R(N) ⊆
P^R(L)} then Π^R(L) = ⋃_{N ∈ M} N, one can conclude that:

Corollary 13. Every maximal solution of a satisfiable system of regular equations of the
form (1) is regular. There exist at most doubly exponentially many maximal solutions in
the size of (1).

Lemma 14. The RLM problem is in EXPSPACE.

Proof. The algorithm, using exponential space in the size of the regular
expressions R and A(X_1, ..., X_n), works in the following stages: (1) Convert the
expression R to a nondeterministic automaton R (polynomial in the size of the expression).
(2) Guess, nondeterministically, automata R_{L_i} recognizing Π^R(L_i) for some potential
solution θ = {X_1 ← L_1, ..., X_n ← L_n}. We can do it using exponential space because
R_{L_i} has exponential size in the size of the expression R. (3) Convert the pattern automaton
A for A(X_1, ..., X_n) to Â by replacing each occurrence of X_i in A with the automaton
R_{L_i}. Hence, Â is a nondeterministic automaton of size exponential in the size of the
regular expressions R and A(X_1, ..., X_n). (4) The equivalence problem for two
nondeterministic automata is in PSPACE. Therefore checking whether the automaton Â
is equivalent to the automaton R is in EXPSPACE in the size of (1). □

In this section we show that the RLM problem is EXPSPACE-hard. In order
to obtain this lower bound we exploit the following theorem and a certain detail of its
proof:

Theorem 15 [6]. Let SE be a semi-extended regular expression over a finite alphabet Σ.
The problem whether SE is equivalent to Σ* is EXPSPACE-complete.

Now we introduce a particular class of semi-extended regular expressions.

Definition 16. Let {A_1, ..., A_l} be a finite set of regular expressions over some
finite alphabet Σ, where each A_i has the form ⋂_{j=1}^{k(i)} B_j^i for some positive integer
k(i). Besides, each B_j^i is an ordinary regular expression built over Σ using ·, +, *.
We call such expressions A_i intersection atoms. The semi-extended regular expression
SE is one level intersection if SE is a regular expression which is built over some set of
intersection atoms using the operators ·, +, *.

Below, we state a conclusion from the proof of Theorem 15. The reader who
is interested in verifying this conclusion should look at the proof of this theorem given
in [1].

Problem 17. (ONE LEVEL) INSTANCE: An alphabet Σ, a regular expression S which
is one level intersection. QUESTION: Is S equivalent to Σ*?

Corollary 18. The ONE LEVEL problem is EXPSPACE-hard.

Now we show how we can reduce the ONE LEVEL problem, in polynomial time, to
the RLM problem.

Lemma 19. The RLM problem is EXPSPACE-hard.

Proof. Let us take any semi-extended regular expression SE which is one level
intersection. Let SE = B(A_1, ..., A_l) where B(X_1, ..., X_l) is an open regular
expression without any intersection operator, and the A_i are intersection atoms. We
also assume that the variables X_1, ..., X_l are pairwise different and each variable is associated
with a single occurrence of an intersection atom in the term B. Let us recall that each
atom A_i has the form ⋂_{j=1}^{k(i)} B_j^i. For each A_i, we create the system of language
equations

𝒜_i = {X_i + B_j^i = B_j^i | 1 ≤ j ≤ k(i)}. (2)

At the end of the reduction, the equation in the following form is added:

ℬ = {B(X_1, ..., X_l) = Σ*}. (3)

It remains to justify the correctness of our construction, that is, the system of regular
language equations S = ℬ ∪ ⋃_{i=1}^{l} 𝒜_i is satisfiable if and only if the expression SE is
equivalent to Σ*.

Obviously, A_i = ⋂_{j=1}^{k(i)} B_j^i ⊆ B_k^i for 1 ≤ k ≤ k(i) and for each 1 ≤ i ≤ l. Therefore
A_i + B_j^i = B_j^i for any j. Assuming L(B(A_1, ..., A_l)) = Σ* we have that the substitution
θ = {X_1 ← L(A_1), ..., X_l ← L(A_l)} satisfies simultaneously the ℬ equation and the 𝒜_i
equations. In the other direction, (2) and (3) are satisfiable by θ if and only if
{θ(X_i) ⊆ B_j^i}_{j=1}^{k(i)}, that is, θ(X_i) ⊆ L(A_i), and by monotonicity of the operator B we have
L(B(A_1, ..., A_l)) = Σ*. □
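
The role of the equations X + B = B in this reduction is to force the value of X into each B, and hence into their intersection. A minimal sketch with finite languages represented as Python sets (an assumption made only for illustration, with hypothetical function names) shows the equivalence:

```python
def satisfies_atom_equations(x, bs):
    """Check X + B_j = B_j for every j, reading + as union of finite languages."""
    return all(x | b == b for b in bs)

def contained_in_intersection(x, bs):
    """Check that X is a subset of the intersection of all B_j (bs must be non-empty)."""
    return x <= set.intersection(*bs)

# Two illustrative finite languages playing the role of B_1 and B_2.
bs = [{"a", "ab", "b"}, {"a", "b", "ba"}]
inside = {"a", "b"}
outside = {"a", "ab"}
checks = [
    (satisfies_atom_equations(x, bs), contained_in_intersection(x, bs))
    for x in (inside, outside)
]
```

Both tests agree on every candidate, which mirrors the step of the proof that replaces the intersection atom by a family of union equations.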

We will consider, in Section 4, a version of the RLM problem where we ask
about the existence of a solution over the domain of finite languages (further called FINITE
MATCHING). It is worth mentioning here that the one level expression which is
constructed in the above cited proof [1] has the pleasant property that each intersection
atom denotes a finite language. Hence, the FINITE MATCHING problem is
EXPSPACE-hard.

As a corollary from Lemmas 14 and 19 we obtain

Theorem 20. The RLM problem is EXPSPACE-complete.



4 Finite Solutions 

In this section we present an algorithm which decides whether there exists a finite solution
for an equation of type (1). Here, "finite solution" means a language consisting of
a finite set of strings. Due to Theorem 12 we know that if some finite solution L_f of (1)
exists then this solution is a subset of Π^R(L_f). Additionally, Π^R(L_f) is also a
solution. Since 𝒦 = {Π^R(L) | L ⊆ Σ*} contains a finite set of languages, we split our
algorithm into two parts. The first part is based on nondeterministic guessing of a tuple
(L_1, ..., L_n) ∈ 𝒦^n. In the next step we check whether (L_1, ..., L_n) is a solution of (1). If
the check returns a positive answer (of course, a maximal solution need not be unique)
it remains for us to solve the following problem:

Problem 21. (FINITE BASE) INSTANCE: A regular expression A(L_1, ..., L_n).
QUESTION: Does a tuple of finite languages (C_1, ..., C_n) such that C_1 ⊆ L_1, ..., C_n ⊆
L_n and L(A(L_1, ..., L_n)) = L(A(C_1, ..., C_n)) exist?

Before we show an algorithm which solves the FINITE BASE problem, it is worth
recalling that, by Proposition 8, every such language can be described by a nondeterministic
automaton of exponential size in the size of the input of the matching problem. Hence, a
tuple of automata for (L_1, ..., L_n) is guessed from a doubly exponential domain and we
need only an exponential-size counter to store the information about which tuple of automata
is currently considered.

Let $A = (s, Q, E, \delta, \chi, F)$ be a finite automaton with coloring $\chi$. In what follows we
think of an automaton as a directed labelled graph: $Q$ is a set of vertices (states),
$E$ is a set of labelled edges, $\delta$ is a function giving the labelling, $s$ is a distinguished
vertex called the starting vertex, $F \subseteq Q$ is the set of so-called accepting states (vertices),
and $\chi : E \to \{0, 1\}$. A path for a finite sequence of labels (a word) $w$ is a sequence of vertices
joined by edges whose labels spell $w$. For a pair of states $p$ and $q$ and a word $w \in \Sigma^*$, there
can exist more than one path between $p$ and $q$ labelled by $w$. The set
of such paths is denoted by $Path_p^q(w)$. Among all paths for $w$ we distinguish
accepting paths, i.e. those which start from $s$ and end at some accepting vertex. $Path(w)$
denotes the set of all accepting paths for a string $w$. For a particular path we can
look at the sequence of colors assigned to the edges of the path and
find the maximal length of a sequence of 1's not separated by any occurrence of 0. This length
is called the weight of the path $p$ and is denoted by $weight(p)$. To every
string $w$ we assign the nonnegative integer $Weight(w) = \min\{weight(p) \mid p \in Path(w)\}$.
Similarly, $Weight_p^q(w) = \min\{weight(p) \mid p \in Path_p^q(w)\}$. For further use, we define a
function $H$ which returns 0 for a path $p$ if and only if the edges
of $p$ are colored solely 0. Let $D, D_p^q : \Sigma^* \to \{0, 1\}$ be defined in the following
way: $D(w) = 1$ iff $H(p) = 1$ for every $p \in Path(w)$, and similarly $D_p^q(w) = 1$
iff $H(p) = 1$ for every $p \in Path_p^q(w)$. In this section we consider the problem whether
$\sup\{Weight(w) \mid w \in L(A)\}$ equals $\infty$. We call it SUPREMUM.
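The definitions of path weight and of Weight(w) can be illustrated by a small sketch. The edge-list representation (source, label, target, color) and the names `path_weight` and `word_weight` are assumptions made for this illustration only; the enumeration of all paths is exponential and is meant for tiny examples.

```python
from itertools import groupby


def path_weight(colors):
    """Length of the longest run of consecutive 1s in a sequence of edge colors."""
    return max((len(list(run)) for color, run in groupby(colors) if color == 1),
               default=0)


def word_weight(edges, start, accepting, word):
    """Minimum path weight over all accepting paths labelled by `word`.

    `edges` is a list of (source, label, target, color) tuples (hypothetical
    representation). Returns None when the word has no accepting path.
    """
    # Each frontier entry is (current state, colors seen so far on this path).
    frontier = [(start, ())]
    for letter in word:
        frontier = [(target, colors + (color,))
                    for state, colors in frontier
                    for source, label, target, color in edges
                    if source == state and label == letter]
    weights = [path_weight(colors) for state, colors in frontier
               if state in accepting]
    return min(weights) if weights else None


# Two states, starting and accepting state 0, a single letter 'a'.
example_edges = [(0, "a", 0, 1), (0, "a", 1, 0), (1, "a", 0, 1)]
print(word_weight(example_edges, 0, {0}, "aaaa"))
```

In this example the path 0, 1, 0, 1, 0 alternates colors 0 and 1, so the minimum is attained by a path whose longest run of 1's has length one, even though the self-loop path consists only of 1's.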

The purpose of introducing the definition of an automaton with a coloring function is
to reduce the FINITE BASE problem to the SUPREMUM problem. We give
here a short recipe for obtaining the appropriate automaton with coloring. First,
the expression $A(X_1, \ldots, X_n)$ should be translated to a nondeterministic automaton $A(X_1, \ldots, X_n)$
without $\varepsilon$-transitions, treating the variable symbols as terminal symbols. It is further
assumed that none of the $L_i$ contains $\varepsilon$. Otherwise, one can guess which of the languages $L_i$
contain $\varepsilon$ and start the translation with the expression in which each such variable $X_{i_j}$
is replaced by $\varepsilon + X_{i_j}$, where $\{X_1, \ldots, X_n\} = \{X_{i_1}, \ldots, X_{i_n}\}$.

Next, every edge $p \xrightarrow{a} q$, where $a \in \Sigma$, is given color 0. Next, we take, for each
$1 \le i \le n$, a nondeterministic automaton without $\varepsilon$-transitions
$M_i = (Q_i, \delta_i, s_i, F_i)$ corresponding to $L_i$. We paint all edges of $M_i$ with
color 1. Next, we start the process of replacing all edges $p \xrightarrow{X_i} q$ in automaton $A$ with
automaton $M_i$. This is the crucial point of our reduction. Namely, for each $1 \le i \le n$, each
transition $r \xrightarrow{a} f_i \in \delta_i$ where $f_i \in F_i$, and each $q$ that is the target of an edge labelled $X_i$, we
add a new edge $r \xrightarrow{a} q$ and set $\chi(r \xrightarrow{a} q)$ to 0. In this way we obtain an automaton
with coloring $A' = (Q', \delta', s', F', \chi)$. Next, for each triple consisting of $1 \le i \le n$, a state $p$ that is the source of
an edge labelled $X_i$, and a transition $r \xrightarrow{a} p \in \delta$, we add a new edge $r \xrightarrow{a} s_i$. We set the
color of the newly introduced edge, $\chi(r \xrightarrow{a} s_i)$, to 0. Next, we remove all edges labelled
by $X_1, \ldots, X_n$. The automaton created in this way is denoted $A = (Q, \delta, s, F, \chi)$. After the above
construction it may happen that we obtain an automaton whose starting state has no outgoing
edges even though $L(A(L_1, \ldots, L_n)) \ne \emptyset$. In order to prevent this undesirable situation
we assume, without loss of generality, that the FINITE BASE problem is stated
as $A(L_1, \ldots, L_n) = aA'(L_1, \ldots, L_n)$. Hence $a$ is the only label which goes out of $s$ in
$A$. After execution of our procedure there may appear two edges between the same pair of
vertices which are labelled by the same terminal and colored with different colors. We allow
this situation.

Now we give a short justification that $\sup\{Weight(w) \mid w \in L(A)\} = \infty$ if and only
if no finite base exists for $A(L_1, \ldots, L_n)$. Assume that a finite base $(C_1, \ldots, C_n)$
exists. Let $A(exp(L_1), \ldots, exp(L_n))$ be the regular expression which arises after substituting,
for each occurrence of a nonterminal symbol $X_i$ in the regular expression $A(X_1, \ldots, X_n)$, the
regular expression $exp(L_i)$ denoting the language $L_i$. Let $der(w)$ denote the set of
derivations of a word $w$ from the expression $A(exp(L_1), \ldots, exp(L_n))$ and let $sub(d)$ denote
the set of subwords in a derivation $d$ which, inside the derivation, are composed solely
of letters derived from the same occurrence of a subexpression $exp(L_i)$.
We write $max(D)$ to denote a word of maximal length from a set of words $D$. It is
obvious that for each $w \in L(A(L_1, \ldots, L_n))$ there exists a derivation $d \in der(w)$ such
that the maximal subword from $sub(d)$ has length at most $\max\{|w| \mid w \in C_1 \cup \ldots \cup C_n\}$.
However, if a finite base does not exist then we can choose an infinite
sequence of words $w_0, w_1, \ldots \in L(A(L_1, \ldots, L_n))$ such that $s_n > d_n$ holds for each
positive integer $n$, where $s_n = \min(\{k \mid k = \max\{|w| \mid w \in sub(p)\}$ and $p \in der(w_n)\})$ and
$d_n = \max(\{s_1, \ldots, s_{n-1}\})$.

Note that: (1) $L(A(exp(L_1), \ldots, exp(L_n))) = L(A)$; (2) for each
$w \in L(A(exp(L_1), \ldots, exp(L_n)))$ and for each $p \in der(w)$ there exists a counterpart
of this derivation of $w$ in the automaton $A$ with the property that every maximal subsequence of
letters which in the derivation $p$ was derived from one occurrence of $exp(L_i)$ is colored
1, except for its last letter, which is colored 0 (moreover, letters which were not derived from
any $exp(L_i)$ have color 0), and vice versa.

Now we define a finite algebraic structure $\mathcal{B} = (B, \odot)$ for the automaton $A$, and a
one-argument function $[\![\cdot]\!]^{\mathcal{B}} : \Sigma^* \to B$ which will be used to describe the behavior of the coloring
on the paths between every ordered pair of states for a given string. $B$ is the set of pairs
of true-false (zero-one) square matrices $(L, R)$ for which the formula $[R]_i^j \vee [L]_i^j$ holds for
each pair $(i, j)$. Let $R^a$ denote the one-step reachability matrix for a symbol $a \in \Sigma$. This means
that if we assume that the rows and columns of $R^a$ are numbered by the linearly ordered states
from $Q$, then at position $(p, q)$ of $R^a$ (written $[R^a]_p^q$) there is 1 iff $(p, a, q) \in \delta$. One can
extend the definition of the reachability matrix to strings. Namely, $R^w$, for $w \in \Sigma^*$, has 1 at
position $(p, q)$ if and only if $(p, w, q) \in \delta$. The matrix $L^w$ has 1 at position $(p, q)$ if
and only if either all paths for $w$ from $p$ to $q$ are colored solely by 1, or $(p, w, q) \notin \delta$.
It follows that if there is 0 at position $(p, q)$ in the reachability matrix $R^w$ then only
1 can occur at position $(p, q)$ in the matrix $L^w$. The function $[\![\cdot]\!]^{\mathcal{B}}$ is defined in such a way that
$[\![w]\!]^{\mathcal{B}} = (L^w, R^w)$ for every $w \in \Sigma^*$. More formally,

$$[R^w]_p^q = \begin{cases} 1 & : (p, w, q) \in \delta \\ 0 & : \text{otherwise} \end{cases} \qquad (4)$$

$$[L^w]_p^q = \begin{cases} 1 & : [R^w]_p^q = 0 \text{ or } Weight_p^q(w) = |w| \\ 0 & : \text{otherwise} \end{cases} \qquad (5)$$
We define the operation $\odot$ on the structure defined above in such a way that it "keeps
the interpretation" of the result of the operation:

$$(A, R) \odot (B, S) = (A \odot_1 B,\; R \cdot S) \qquad (6)$$

where $R \cdot S$ is the ordinary multiplication of boolean matrices and

$$[A \odot_1 B]_p^q = \bigwedge_{r \in Q} \Bigl( ([R]_p^r \wedge [S]_r^q) \rightarrow ([A]_p^r \wedge [B]_r^q) \Bigr) \qquad (7)$$

That the operator $\odot$ "keeps the interpretation" means $[\![w_1]\!]^{\mathcal{B}} \odot [\![w_2]\!]^{\mathcal{B}} = [\![w_1 w_2]\!]^{\mathcal{B}}$ for every
$w_1, w_2 \in \Sigma^*$. Let $[\![\varepsilon]\!]^{\mathcal{B}} = (U, I)$, where all elements of $U$ are 1's and $I$ is the boolean
matrix which has 1's only on the diagonal. In other words, $[\![\cdot]\!]^{\mathcal{B}}$ is a homomorphism from the
monoid $(\Sigma^*, \text{concatenation}, \varepsilon)$ to $\mathcal{B}$. The operation $\odot$ is well-defined because the structure
$(B, \odot, (U, I))$ is a monoid. Besides, $\odot$ "keeps the interpretation".
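The composition of equations (6) and (7) can be tried out on small matrices. The sketch below assumes the following reading of (7): the first component of the product has 1 at position (p, q) iff for every intermediate state r, whenever r is reachable from p in R and q is reachable from r in S, both first components are 1 at (p, r) and (r, q). The list-of-lists matrix representation, the edge tuples (source, label, target, color) and all function names are assumptions made for this illustration.

```python
def bool_mul(R, S):
    """Ordinary product of square boolean (0/1) matrices."""
    n = len(R)
    return [[int(any(R[p][r] and S[r][q] for r in range(n)))
             for q in range(n)] for p in range(n)]


def compose_first(A, R, B, S):
    """First component of the product, under the reading of (7) stated above."""
    n = len(R)
    return [[int(all((not (R[p][r] and S[r][q])) or (A[p][r] and B[r][q])
                     for r in range(n)))
             for q in range(n)] for p in range(n)]


def compose(x, y):
    """Product of two pairs (A, R) and (B, S) as in equation (6)."""
    (A, R), (B, S) = x, y
    return (compose_first(A, R, B, S), bool_mul(R, S))


def identity(n):
    """The pair (U, I): U is all ones, I is the identity matrix."""
    U = [[1] * n for _ in range(n)]
    I = [[int(p == q) for q in range(n)] for p in range(n)]
    return (U, I)


def letter_element(n, edges, letter):
    """The pair (L^a, R^a) for one letter; edges are (source, label, target, color)."""
    R = [[0] * n for _ in range(n)]
    L = [[1] * n for _ in range(n)]
    for source, label, target, color in edges:
        if label == letter:
            R[source][target] = 1
            if color == 0:  # some edge from source to target is not colored 1
                L[source][target] = 0
    return (L, R)


example_edges = [(0, "a", 0, 1), (0, "a", 1, 0), (1, "a", 0, 1)]
a = letter_element(2, example_edges, "a")
aa = compose(a, a)
print(aa)
```

For this two-state example, the first component of the pair for the word aa has 1 only at position (1, 0): the single path from state 1 to state 0 labelled aa uses two edges colored 1.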

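We define a new structure $\mathcal{C} = (B, \otimes)$ with an operation $\otimes$ on the set $B$:

$$(A, R) \otimes (B, S) = (A \otimes_1 B,\; R \cdot S)$$

$$[A \otimes_1 B]_p^q = \bigwedge_{r \in Q} \Bigl( ([R]_p^r \wedge [S]_r^q) \rightarrow ([A]_p^r \vee [B]_r^q) \Bigr)$$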

Now we introduce the second function, which describes the behavior of the coloring on the
paths for individual strings. Namely, $[\![\cdot]\!]^{\mathcal{C}}$ is defined formally as follows:
$[\pi_2([\![w]\!]^{\mathcal{C}})]_p^q = [\pi_2([\![w]\!]^{\mathcal{B}})]_p^q$ and $[\pi_1([\![w]\!]^{\mathcal{C}})]_p^q = D_p^q(w)$. It is left to the reader to prove
that $[\![\cdot]\!]^{\mathcal{C}}$ "keeps the interpretation" and that $(B, \otimes, (\overline{I}, I))$ is a monoid, where we write a line
over a matrix $M$ to denote the matrix which arises from $M$ by replacing each entry of value 1
with 0 and each entry of value 0 with 1.

Let $A \subseteq M$. The $A$-closure of a finite monoid $\mathcal{M} = \langle M, \circ, z \rangle$, denoted by $(A)^+$, is
the least set containing $A$ and such that if $a, b \in (A)^+$ then $a \circ b \in (A)^+$. Let $U, W \subseteq M$;
the composition of two subsets of the monoid $\mathcal{M}$ is defined as $U \circ W = \{a \circ b \mid a \in U$ and $b \in W\}$.
A power of a subset $U$ of $M$ is denoted by $U^k$.

Below we present an algorithm which decides the SUPREMUM problem. After this
presentation we give a sketch of the proof of completeness, soundness and termination of the
algorithm.

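INPUT: automaton $A$ with coloring;

$R = \{R^a \mid a \in \Sigma\} \cup \{I\}$; $S = \{I\}$; $LR = \{(L^a, R^a) \mid a \in \Sigma\}$;

$Factors = \{(U, I)\}$; $Stack = \emptyset$;

repeat

(1) $Factors = Factors \odot LR$;

(2) $Closure = (Factors)^+ \otimes \{(\overline{X}, X) \mid X \in S\}$;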

(3) if -^{MYdCiousure V,6F(ki(^)]I ^ ^2(1")]!)) then retuHi false ; 

(4) if $(Factors \times S) \notin Stack$ then

(5) $Stack = Stack \cup \{Factors \times S\}$

(6) else return true;

(7) $S = (S \cdot R) \cup \{I\}$;
until false;
For each word $w$ accepted by the coloring automaton $A$ there is a set of accepting paths
$Path(w)$. We have $\sup\{Weight(w) \mid w \in L(A)\} < \infty$ if and only if there exists a nonnegative
integer $\eta$ such that for each word $w \in L(A)$ there exists a path $p \in Path(w)$ with
$weight(p) \le \eta$. We call such a path a witness. Each word $w \in L(A)$ of length greater
than $\eta$ can be factorized into $\kappa = |w| \;\mathrm{div}\; (\eta + 1)$ factors of length $\eta + 1$ and a tail of length
$|w| - \kappa(\eta + 1) \le \eta$ ($w = w_1 \ldots w_\kappa w_{tail}$). Sometimes we will use the term $\eta$-factorization to
emphasize the length of the factors. The witness $p$ of $w$ can be factorized according to the
factorization of $w$ ($p = p_1 \ldots p_\kappa p_{tail}$). From the pigeonhole principle we obtain that each part $p_i$,
for $1 \le i \le \kappa$, of the witness is not colored solely by 1. Let $f(p_i)$ be the first state of the factor
$p_i$ and let $l(p_i)$ be the last state of $p_i$. Hence $[\pi_1([\![w_i]\!]^{\mathcal{B}})]_{f(p_i)}^{l(p_i)} = 0$ and $[\pi_2([\![w_i]\!]^{\mathcal{B}})]_{f(p_i)}^{l(p_i)} = 1$
for all $1 \le i \le \kappa$. As far as $w_{tail}$ is concerned, we use $\overline{\pi_2([\![w_{tail}]\!]^{\mathcal{B}})} \times \pi_2([\![w_{tail}]\!]^{\mathcal{B}})$ instead
of $[\![w_{tail}]\!]^{\mathcal{B}}$. In this way we treat every nonempty fragment $p_{tail}$ of a path as containing at
least one edge colored 0, regardless of whether this is true. Besides, in this way we ensure the
correct transmission of reachability for any $p_{tail}$.

It is clear that if $|p_i| = e + 1$ for $1 \le i \le \kappa$ and $0 \le |p_{tail}| \le e$, and the sequence
$S = [\pi_1([\![w_1]\!]^{\mathcal{B}})]_{f(p_1)}^{l(p_1)}, \ldots, [\pi_1([\![w_\kappa]\!]^{\mathcal{B}})]_{f(p_\kappa)}^{l(p_\kappa)}, [\pi_1([\![w_{tail}]\!]^{\mathcal{B}})]_{f(p_{tail})}^{l(p_{tail})}$ consists solely of 0's, then
$weight(p) \le 2e$. However, if $S$ contains at least one occurrence of 1 then $weight(p) > e$.
It follows that if, for some witness $p$ of a string $w$, a length $e$ of factors is found for which all 1's
are absorbed, then $weight(p) \le 2e$. The most important
question is whether we can find such a length simultaneously for all witnesses corresponding
to all $w \in L(A)$.

Thanks to the function $[\![\cdot]\!]^{\mathcal{B}}$ and the matrix representation we are able to capture the
behavior of the coloring simultaneously on all paths $p \in Path(w)$ for any $e$-factorization.
The formula $[\pi_2([\![w_i]\!]^{\mathcal{B}})]_p^q \wedge \neg[\pi_1([\![w_i]\!]^{\mathcal{B}})]_p^q$ means that for the factor $w_i$ a path between $p$ and $q$ exists
and that among all paths between $p$ and $q$ we can choose a path on which not all edges
are colored 1. The operator $\otimes$ detects a path and the corresponding sequence of
$\pi_1$ entries for $w_1, \ldots, w_\kappa, w_{tail}$ which contains only 0's, provided such a path exists.

From the above considerations we conclude the following lemma:

Lemma 22 . 

1. For every w S L{A) and its rj— factorization w = wi... w^wtaii 

3/gF “'[[[wil®®---® [[wkI®® (7T2([[wtai/l®) X 7T2(|wta*il®))]{ W eight{w) 

< 2 ??, 

2. p — Factors = {liu]® \ w G A w G E*}; p — L{A) Factorizations = 

{liuil® (g) ... (g) I wta^l G ^ U\1=i G L”') A 

Wi... Wf^Wtaii G L(A) },• {p — Factors)'^; = lJfc=o where R = {R°‘ \ aG 
E} are finite sets and they have size with upper bound which depend solely on the 
size of automaton A, 

3. p — L{A) Factorizations C (p — Factors)'^, 

4. Let C = {p — Factor s)^iS) R^^ where R<^ = {{X,X) \ X G R^^}, 

D = p — L{ A) Factorization (g) R^"^. A formula = \Jy<zC VgeF([^i(^)]I ^ 
[7r2(H)]f) is equivalent to \/y^D VqeF(['^l(^)]s , 

5. $\neg\varphi$ if and only if there exists $1 \le \eta < \infty$ such that for every string $w \in L(A)$ there
exists a witness $p \in Path(w)$ of weight at most $\eta$.
So, to find a good length $\eta + 1$ of factors for all strings in $L(A)$, we start an infinite loop
which increases the length of the factors by one per execution of the loop, see line (1).
Each next value of the set $S$ is computed in line (7). In line (2) we compute the set of
matrices $C$ and next we compute the value of the negated formula (line (3)). The generalized
disjunction symbols appearing in line (3) can be simulated by a "for" loop. The counter
of the "for" loop is proportional to the logarithm of the cardinality of the domain over which the
disjunction ranges. If the calculated value is true then we return
false, which means that $\sup\{Weight(w) \mid w \in L(A)\} < \infty$. It remains to prove that
our algorithm terminates. By the above lemma, the size of the Stack variable is bounded in terms of the size
of the automaton $A$. Note that if in some step of our algorithm we cannot add a new object to Stack,
then the subsequently computed Factors and Closure sets will repeat. Hence,
if the value of the formula from line (3) was false until now then it will remain false in the
following steps. This is the time to stop the computation with the answer true, which means that
$\sup\{Weight(w) \mid w \in L(A)\} = \infty$. We estimate the maximal number of different sets
Factors at $2^{2^{2n^2}}$, where $n$ is the number of states of the automaton $A$. The size of the set $S$ is
at most $2^{n^2}$, and because the size of $S$ never decreases during execution, we obtain
at most $2^{n^2}$ different occurrences of $S$. Therefore, the "repeat" loop is executed at most
$2^{n^2} \cdot 2^{2^{2n^2}}$ times. Hence, our algorithm works in double exponential time in the size of $A$.

It follows that it solves the FINITE MATCHING problem in triple exponential
time in the size of equation (1).

Our algorithm can be improved by a more careful computation of Factors and
the set $S$. Namely, we can increase the mentioned sets twofold in every step. After this
improvement SUPREMUM can be solved in exponential time.

Lemma 23 . Let \ = k = 22"^ r = 2”^ ^o- R={R°-\a& 

1- {LP)% C 

2. {{LP)^q)+ C {{LP)<^P+^>)+ = {{LP)<^’^^>)+, 

3. R<^ = R<^. 

Lemma 24. Let F{Y) = Ch = ((LP)^+')>K 

CI 2 = The following condition holds ^ 

Proof. Assume that in the SUPREMUM problem we apply factors of nonuniform
length. Namely, each word $w \in L(A)$ is factorized into factors of
length at least $\lambda + 1$ and at most $\lambda + \kappa$. We repeat the reasoning which proved the soundness
and completeness of the algorithm. Firstly, if the SUPREMUM problem has a
negative solution then for every accepted string there exists a witness $p$ of weight less
than $2(\lambda + \kappa) + 1$: inside each of the factors of the witness $p$ at least one color 0
appears. On the other hand, if SUPREMUM has a positive answer then there exists a
string $w \in L(A)$ such that all witnesses for $w$ have weight at least $\lambda$. Therefore, if

Cig = ® then VreC/i '^(^) ^ VyeCts holds. And from 

lemma 23 we obtain VyeCig ^ VyeC /2 
By Lemmas 23 and 24 we can write an algorithm solving the SUPREMUM
problem which works in polynomial space. The idea of the algorithm is the following:
apply the well-known reachability method of Savitch wherever possible. Namely, one
can recognize (1) does any matrix M G { | w G UfcT using 0(log^(«:)) space, 
(2) does any M G using 0(log^(/r)) space assuming that we use oracle 

frompoint(l) to recognize of factors, (3) does any M G i?^^("fixedpointforreachability 
set") using $O(\log^2(r))$ space. The fact that the structures $\mathcal{B}$, $\mathcal{C}$ and $(R, \cdot, I)$, where $R$ is a
set of boolean matrices, are monoids (in particular, that they are associative) plays a fundamental role in the
soundness of the sketched algorithm.

Theorem 25. The FINITE MATCHING problem is EXPSPACE-complete.
Abstract. We consider the problem of scheduling jobs on related machines
owned by selfish agents and provide the first deterministic mechanisms
with constant approximation that are truthful; that is, truth-telling
is a dominant strategy for all agents. More precisely, we present
deterministic polynomial-time $(2 + \varepsilon)$-approximation algorithms and suitable
payment functions that yield truthful mechanisms for several NP-hard
restrictions of this problem. Our result also yields a family of
deterministic polynomial-time truthful $(4 + \varepsilon)$-approximation mechanisms for
any fixed number of machines. The only previously known mechanism
for this problem (proposed by Archer and Tardos [FOCS 2001]) is
3-approximate, randomized and truthful under a weaker notion of
truthfulness.

To the best of our knowledge, our mechanisms are the first non-trivial
polynomial-time deterministic truthful mechanisms for this NP-hard
problem.

To obtain our results we introduce a technique to transform the PTAS
by Graham into a deterministic truthful mechanism.



1 Introduction 

The Internet is a complex distributed system where a multitude of heteroge- 
neous entities (e.g., providers, autonomous systems, universities, private compa- 
nies, etc.) offer, use, and even compete with each other for resources. Resource 
allocation is a fundamental issue for the efficiency of a complex system. Several 
efficient distributed protocols have been designed for resource allocation. The 
underlying assumption is that the entities running the protocol are trustworthy; 
that is, they behave as prescribed by the protocol. This assumption is unrealistic 
in some settings as the entities owning the resources might try to manipulate 
the system in order to get some advantages by reporting false information. For 
example, a router of an autonomous system can report false link status trying 
to redirect traffic through another autonomous system. 

With false information even the most efficient protocol may lead to unrea- 
sonable solutions if it is not designed to cope with the selfish behavior of the 

* Work supported by the European Project IST-2001-33135, Critical Resource Sharing
for Cooperation in Complex Systems (CRESCCO).



V. Diekert and M. Habib (Eds.): STACS 2004, LNCS 2996, pp. 608-619, 2004.
© Springer-Verlag Berlin Heidelberg 2004
single entities. The field of mechanism design provides an elegant theory to deal
with this kind of problem. The main idea of this theory is to pay the agents to
convince them to perform strategies that help the system to optimize a global
objective function. A mechanism $M = (A, P)$ is a combination of two elements: an
algorithm $A$ computing a solution and a payment rule $P$ specifying the amount
of "money" the mechanism should pay to each entity. Informally speaking, each
agent $i$ has a valuation function that associates to each solution $X$ some value
$V_i(X)$, and the mechanism pays $i$ an amount $P_i(X, r_i)$ based on the solution $X$
and on the reported information $r_i$. A truthful mechanism is a mechanism such
that the payments guarantee that, when $X = X(r_i)$ is the solution computed
by the mechanism, $U_i := P_i(X, r_i) + V_i(X)$ is maximized for $r_i$ equal to the true
information (see Sect. 1.3 for a formal definition).

Recently, mechanism design has been applied to several optimization prob- 
lems arising in computer science, networking and algorithmic questions related to 
the Internet (see [10] for a survey). In the seminal papers by Nisan and Ronen [8, 
9] (see also [11]) it is first pointed out that classical results in mechanism design 
theory, originated from micro economics and game theory, do not completely fit 
in a context where computational issues play a crucial role [9]. 

The main purpose of this paper is to provide polynomial-time approximation
truthful mechanisms for the problem of scheduling jobs on parallel related
machines ($Q\|C_{\max}$).



1.1 Previous Work 

The theory of mechanism design dates back to the seminal papers by Vickrey [12], 
Clarke [4] and Groves [7]. Their celebrated VCG mechanism is still the prominent
technique to derive truthful mechanisms for many problems (e.g., shortest
path, minimum spanning tree, etc.). In particular, when applied to combinatorial
optimization problems (see e.g., [8,11]), the VCG mechanisms guarantee
truthfulness under the hypothesis that the optimization function is utilitarian¹
and the mechanism is able to compute the optimum.

Unfortunately, neither of these hypotheses holds for $Q\|C_{\max}$, since we aim at
minimizing the maximum over all machines of their completion times, and the
problem is NP-hard [5].

In [2] the authors characterize those algorithms which can be turned into a
truthful mechanism for $Q\|C_{\max}$. Their beautiful result brings us back to "pure
algorithmic problems", as all we need is to find a good algorithm for the original
problem which also satisfies the additional monotonicity requirement: increasing
the speed of exactly one machine does not make the algorithm decrease the work
assigned to that machine (see Sect. 1.3 for a formal definition, and Theorem 7
below). The authors then provide (i) an exact truthful mechanism based on
the algorithm computing the (lexicographically minimal) optimal solution and



¹ A maximization problem is utilitarian if the optimization function can be written as the sum
of the agents' valuation functions.
(ii) a randomized 3-approximation mechanism that is truthful in expectation, a
weaker notion of truthfulness.

Nisan and Ronen [8,11] considered the unrelated machines case and provide 
an n-approximation deterministic truthful mechanism for it (n is the number of 
machines). Rather surprisingly, this mechanism is optimal for the case of n = 2. 
For the case n > 2 [8,11] prove that a wide class of “natural” mechanisms cannot 
achieve a factor better than n, if we require truthfulness. Finally, for n = 2 [8, 
11] give a randomized 7/4-approximation mechanism. 

There is a significant difference between the definition of truthfulness used 
in [8,11] and the one used in [2]. Indeed, the randomized 7/4-approximation 
algorithm in [8,11] yields a truthful dominant strategy for any possible random 
choice of the algorithm. In [2], instead, the notion of utility is replaced by the 
expected utility one: even though the expected utility is maximized when telling 
the truth, for some random outcome, there might exist a better (untruthful!) 
strategy. 

This idea is pushed further in [1], where one-parameter agents are considered
for the problem of combinatorial auctions. In this work, truthfulness is achieved
w.r.t. expected utility and with high probability, that is, the probability that an
untruthful declaration improves the agent's utility is arbitrarily small.



1.2 Our Contribution 

It is natural to ask whether some problems require some relaxation of the
definition of truthfulness in order to achieve polynomial-time approximation
mechanisms. In this paper we investigate the existence of truthful polynomial-time
approximation mechanisms for $Q\|C_{\max}$, while maintaining the strongest definition
of truthfulness: truth-telling is a dominant strategy over all possible strategies of
an agent.

We first show that, for any fixed number of machines, $Q\|C_{\max}$ admits a
deterministic truthful $(2 + \varepsilon)$-approximation mechanism if there exists a monotone
allocation algorithm Gc whose cost is within an additive factor of $O(t_{\max}/s_1)$
from the cost of Greedy, where $t_{\max}$ is the largest job weight and $s_1$ is the
smallest machine speed (see Sect. 2). Our result is a modification of the
classical PTAS [6]. Notice that this PTAS cannot be used to construct a truthful
mechanism because Greedy is not monotone and the allocation produced by 
the combination of the two algorithms (the optimal and the greedy one) is also 
not monotone. Our technical contribution here is the analysis of a new algo- 
rithm obtained by combining the optimal algorithm and Gc, that preserves the 
monotonicity and whose cost is within a factor of 2 of the cost of the PTAS. 

We then show that such a monotone algorithm Gc exists for the following 
versions of the problem (see Sect. 3): 

- speeds are integer and the largest speed is bounded from above by a constant;

- speeds are divisible, that is, they belong to a set $C = \{c_1, c_2, \ldots, c_p, \ldots\}$ such
that for each $i$, $c_{i+1}$ is a multiple of $c_i$.
Thus, for both these cases, we obtain a family of deterministic truthful
$(2 + \varepsilon)$-approximation mechanisms (see Sect. 4). Observe that all such restrictions
remain NP-hard even for two machines [5]. To the best of our knowledge, this is the
first result in which approximate solutions yield truthful mechanisms, where
truthfulness is defined in the strongest sense. Indeed, the mechanism in [2] is
only truthful on average. Although our new algorithm is relatively simple, its
analysis, in terms of monotonicity and approximability, is far from trivial and
goes through several properties of greedy allocations on identical machines.

We emphasize that the importance of an approximating mechanism for the
case of divisible speeds is both practical and theoretical. Indeed, on the one hand, in
many practical applications "speeds" are not arbitrary but are taken from a
pre-determined set of "types", yielding values that are multiples of each other.
On the other hand, this result implies the existence, for any fixed number of machines, of
deterministic truthful $(4 + \varepsilon)$-approximate mechanisms for the case of arbitrary
speeds, for any $\varepsilon > 0$.

Observe that, also in the case of divisible speeds, existing and natural ap- 
proximation algorithms are not monotone, and thus they are not suitable for 
truthful mechanisms (see [3] for a discussion). 

Finally, our mechanisms satisfy voluntary participation and are able to com- 
pute the payments within polynomial time (see Sect. 4). The latter is a property 
that cannot be directly derived from the results in [2] . 

Due to lack of space some proofs are omitted or only sketched. We refer the
interested reader to the full version of this work [3].

1.3 Preliminaries 

We consider the problem of scheduling on related parallel machines ($Q\|C_{\max}$).
We are given the speed vector $s = (s_1, s_2, \ldots, s_n)$, with $s_1 \le s_2 \le \ldots \le s_n$, of the
$n$ machines and a job sequence with weights $\sigma = (t_1, t_2, \ldots, t_m)$. In the sequel
we simply denote the $i$-th job by its weight $t_i$. The largest job weight in $\sigma$ is
denoted by $t_{\max}$. A schedule is a mapping that associates each job to a machine.
The amount of time to complete job $j$ on machine $i$ is $t_j/s_i$. The work of machine
$i$, denoted by $w_i$, is given by the sum of the weights of the jobs assigned to $i$. The
load (or finish time) of machine $i$ is given by $w_i/s_i$. The cost of a schedule is the
maximum load over all machines, that is, its makespan. Given an algorithm $A$
for $Q\|C_{\max}$, $A(\sigma, s)$ denotes the solution computed by this algorithm on input
the job sequence $\sigma$ and the speed vector $s$. The cost of the solution computed by
algorithm $A$ on input $\sigma$ and $s$ is denoted by $cost(A, \sigma, s)$. We will also consider
scheduling algorithms that take as third input the parameter $h$. In this case we
denote by $A(\sigma, s, h)$ the schedule output and by $cost(A, \sigma, s, h)$ its cost.
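The notions of work, load and makespan can be made concrete with a short sketch. Representing a schedule as a list that maps each job index to a machine index, and the function names `works` and `makespan`, are assumptions made for this illustration; exact fractions are used to avoid rounding when comparing loads.

```python
from fractions import Fraction


def works(assignment, weights, n):
    """Work of each of the n machines: total weight of the jobs assigned to it."""
    work = [0] * n
    for job, machine in enumerate(assignment):
        work[machine] += weights[job]
    return work


def makespan(assignment, weights, speeds):
    """Cost of a schedule: the maximum load w_i / s_i over all machines."""
    work = works(assignment, weights, len(speeds))
    return max(Fraction(work[i], speeds[i]) for i in range(len(speeds)))


# Three jobs of weights 4, 3, 2 on two machines of speeds 1 and 2.
example_assignment = [1, 0, 1]  # jobs 0 and 2 on the fast machine
print(works(example_assignment, [4, 3, 2], 2))
print(makespan(example_assignment, [4, 3, 2], [1, 2]))
```

Here the slow machine has work 3 and the fast machine has work 6, so both loads equal 3, which is the makespan of this schedule.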

We consider $Q\|C_{\max}$ in the context of selfish agents in which each machine
is owned by an agent and the value of $s_i$ is privately known to the agent. A
mechanism for this problem is a pair $M = (A, P)$, where $A$ is an algorithm to
construct a solution and $P$ is a payment function. In particular, the mechanism
asks each agent $i$ to report her speed and, based on the reported costs, constructs
a solution using $A$ and pays the agents according to $P = (P_1, P_2, \ldots, P_n)$. The
profit of agent $i$ is defined as $profit_i = P_i - w_i/s_i$, that is, payment minus the
cost incurred by the agent in being assigned work $w_i$.

A strategy for an agent $i$ is to declare a value $b_i$ for her speed. Let $b_{-i}$ denote
$(b_1, b_2, \ldots, b_{i-1}, b_{i+1}, \ldots, b_n)$. A strategy $b_i$ is a dominant strategy for agent $i$ if
$b_i$ maximizes $profit_i$ for any possible $b_{-i}$. A mechanism is truthful if, for any
agent $i$, declaring her true speed is a dominant strategy. A mechanism satisfies
voluntary participation if, for any agent $i$, declaring her true speed yields a
non-negative utility.

An algorithm for the $Q\|C_{\max}$ problem is monotone if, given in input the
machine speeds $b_1, b_2, \ldots, b_n$, for any $i$ and fixed $b_{-i}$, the work $w_i$ is non-decreasing
in $b_i$.
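The monotonicity condition can be probed empirically on a finite grid of declared speeds. The sketch below is only a sanity check under stated assumptions, not a proof of monotonicity: the helper `is_monotone_on_grid` and the two toy allocation rules are hypothetical names introduced here, and an algorithm is modelled as a function from (weights, declared speeds) to a list mapping each job to a machine.

```python
def is_monotone_on_grid(algorithm, weights, bids, speed_grid):
    """Check, on a finite grid, that raising one declared speed never lowers that machine's work.

    For each machine i, the other bids stay fixed while b_i runs through
    `speed_grid` in increasing order.
    """
    for i in range(len(bids)):
        previous = None
        for b in sorted(speed_grid):
            trial = list(bids)
            trial[i] = b
            assignment = algorithm(weights, trial)
            work_i = sum(w for w, m in zip(weights, assignment) if m == i)
            if previous is not None and work_i < previous:
                return False
            previous = work_i
    return True


def all_to_fastest(weights, speeds):
    """Toy rule: every job goes to the fastest machine (smallest index on ties)."""
    fastest = max(range(len(speeds)), key=lambda i: (speeds[i], -i))
    return [fastest] * len(weights)


def all_to_slowest(weights, speeds):
    """Toy rule: every job goes to the slowest machine (smallest index on ties)."""
    slowest = min(range(len(speeds)), key=lambda i: (speeds[i], i))
    return [slowest] * len(weights)


print(is_monotone_on_grid(all_to_fastest, [3, 2, 1], [2, 2], [1, 2, 3]))
print(is_monotone_on_grid(all_to_slowest, [3, 2, 1], [2, 2], [1, 2, 3]))
```

The first rule passes the check because a machine that declares a higher speed can only gain work, while the second fails because a machine loses all its work as soon as it stops being the slowest.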

Given a sequence $\sigma$ of $m$ jobs, we denote by $\sigma_h$ the subsequence consisting
of the first $h$ jobs in $\sigma$, for any $h \le m$; moreover, $\sigma \setminus \sigma_h$ denotes the sequence
obtained by removing from $\sigma$ the first $h$ jobs.

The Greedy algorithm (also known as the ListScheduling algorithm [6])
processes jobs in the order they appear in $\sigma$ and assigns a job $t_j$ to the machine
$i$ minimizing $(w_i + t_j)/s_i$, where $w_i$ denotes the work of machine $i$ before job $t_j$
is assigned; if more than one machine minimizing the above ratio exists then the
one of smallest index is chosen.
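The Greedy rule just described can be written down directly. This is a minimal sketch; the function name `greedy` and the returned pair (assignment, work) are choices made for this illustration, and exact fractions are used so that ties are detected without rounding errors.

```python
from fractions import Fraction


def greedy(weights, speeds):
    """List scheduling on related machines.

    Jobs are processed in the given order; each job t goes to the machine i
    minimizing (w_i + t) / s_i, where w_i is the current work of machine i.
    Ties are broken in favor of the smallest index.
    """
    work = [0] * len(speeds)
    assignment = []
    for t in weights:
        # min() returns the first minimizer, i.e. the smallest index on ties.
        i = min(range(len(speeds)),
                key=lambda k: Fraction(work[k] + t, speeds[k]))
        work[i] += t
        assignment.append(i)
    return assignment, work


print(greedy([4, 3, 2], [1, 2]))
```

On jobs of weights 4, 3, 2 and speeds 1 and 2, the first job goes to the fast machine (finish time 2 instead of 4), the second to the slow machine (3 instead of 7/2), and the third again to the fast machine (3 instead of 5).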

An optimal algorithm computes a solution of minimal cost $opt(\sigma, s)$. Throughout
the paper we assume that the optimal algorithm always produces the
lexicographically minimal optimal assignment. As shown in [2], this algorithm is
monotone.

An algorithm $A$ is a $c$-approximation algorithm if, for every instance $(\sigma, s)$,
$cost(A, \sigma, s) \le c \cdot opt(\sigma, s)$. A polynomial-time approximation scheme (PTAS) for
a minimization problem is a family $\mathcal{A}$ of algorithms such that, for every $\varepsilon > 0$,
there exists a $(1 + \varepsilon)$-approximation algorithm $A_\varepsilon \in \mathcal{A}$ whose running time is
polynomial in the size of the input.



2 Combining Monotone Algorithms with the Optimum 

In this section we show how to combine an optimal schedule on a subsequence of 
the jobs with the one produced by a monotone algorithm on the remaining jobs 
in order to obtain a good monotone approximation algorithm. Our approach is 
inspired by the PTAS of R. L. Graham [6] that can be described as follows. 
First, we optimally assign the h largest jobs. Then, we complete this assignment 
by running Greedy on the remaining jobs according to the work assigned to the 
machines in the previous phase. 

Unfortunately, this PTAS is not monotone. Indeed, even though the first 
phase is monotone, it is easy to see that Greedy is not monotone [2]. Moreover, 
even if we replace Greedy with a monotone algorithm the resulting algorithm is 
not guaranteed to be monotone. We, instead, propose the following approach. 

Let Gc be any scheduling algorithm. By Opt-Gc we denote the following
algorithm.

Algorithm Opt-Gc

Input: a job sequence $\sigma$, speed vector $s$, and parameter $h$.

Assume that the jobs in $\sigma$ are ordered in non-increasing order by weight.

A. compute the lexicographically minimal schedule among those that have
optimal makespan with respect to job sequence $\sigma_h$ and speed vector $s$;

B. run algorithm Gc on job sequence $\sigma \setminus \sigma_h$ and speed vector $s$ assuming that
machine $i$ has initial load 0, $i = 1, \ldots, n$;

output the schedule that assigns to machine $i$ the jobs assigned to machine $i$
in Phase A and Phase B.
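The two phases of Opt-Gc can be sketched for tiny instances as follows. Several points are assumptions made for this illustration: Phase A is done by brute force over all assignments (exponential time), "lexicographically minimal" is read as the lexicographically smallest assignment vector among the optimal ones, which may differ from the ordering intended in [2], and `fastest_machine` is only a hypothetical stand-in for the algorithm Gc.

```python
from fractions import Fraction
from itertools import product


def lex_min_optimal(weights, speeds):
    """Brute-force Phase A: first optimal assignment in lexicographic order."""
    n = len(speeds)
    best_cost, best_assignment = None, None
    # product() enumerates assignment vectors in lexicographic order, and the
    # strict comparison below keeps the first one that attains the optimum.
    for assignment in product(range(n), repeat=len(weights)):
        work = [0] * n
        for t, i in zip(weights, assignment):
            work[i] += t
        cost = max(Fraction(work[i], speeds[i]) for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, list(assignment)
    return best_assignment


def fastest_machine(weights, speeds):
    """Hypothetical stand-in for Gc: all jobs go to the fastest machine."""
    fastest = max(range(len(speeds)), key=lambda i: (speeds[i], -i))
    return [fastest] * len(weights)


def opt_gc(weights, speeds, h, gc):
    """Opt-Gc: optimum on the h largest jobs, Gc on the rest from zero load."""
    ordered = sorted(weights, reverse=True)      # non-increasing weights
    phase_a = lex_min_optimal(ordered[:h], speeds)
    phase_b = gc(ordered[h:], speeds)            # Gc does not see Phase A loads
    return ordered, phase_a + phase_b


print(opt_gc([2, 5, 3, 1], [1, 2], 2, fastest_machine))
```

Note that Phase B is run independently of the loads produced in Phase A; the two partial schedules are only merged at the end, which is the feature that distinguishes Opt-Gc from the PTAS described above.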

We have the following lemma. 

Lemma 1. If Gc is monotone then Opt-Gc is also monotone. 

In the next sections we show that, if Gc has an approximation factor close to
the one of the greedy algorithm, then, for each $\varepsilon > 0$ and for each number $n$ of
machines, it is possible to choose the value of the parameter $h$ so that Opt-Gc
outputs a schedule of makespan at most $(2 + \varepsilon)$ times the optimal one.

We start by defining the notion of a greedy-close algorithm. 

Definition 1 (greedy-close algorithm). Let $c$ be a constant. An algorithm
Gc is $c$-greedy-close if, for any job sequence $\sigma$ and any machine speed vector
$s = (s_1, s_2, \ldots, s_n)$, $cost(Gc, \sigma, s) \le cost(Greedy, \sigma, s) + c \cdot t_{\max}/s_1$. An algorithm
Gc is greedy-close if it is $c$-greedy-close for some constant $c$.



2.1 Approximation Analysis of Opt-Gc 

In this section, we show that the approximation factor of Opt-Gc is at most 
twice the approximation factor of PTAS-Gc, where PTAS-Gc computes the op- 
timal schedule on the h largest jobs and then combines it with a greedy-close 
solution computed using algorithm Gc. Moreover, in order to guarantee a “good” 
approximability, it makes a balancing step in Phase B where jobs are assigned to 
non-bottleneck machines to reduce the unbalancing, while keeping the solution 
optimality. 

Algorithm PTAS-Gc

Input: a job sequence $\sigma$, speed vector $s$, and parameter $h$.

Assume that the jobs in $\sigma_h$ are the $h$ largest jobs of $\sigma$.

A. compute the lexicographically minimal schedule among those that have
optimal makespan with respect to job sequence $\sigma_h$ and speed vector $s$;

let $opt(\sigma_h, s)$ be the makespan of the schedule produced in this phase;

B. reduce unbalancing without increasing cost by running algorithm Greedy as
long as it is possible to add jobs without exceeding $opt(\sigma_h, s)$ and let $h'$ be
the last job considered in this phase;

C. run algorithm Gc on job sequence $\sigma \setminus \sigma_{h'}$ and speed vector $s$ assuming that
machine $i$ has initial load 0, for $i = 1, \ldots, n$;

output the schedule that assigns to machine $i$ the jobs assigned to machine $i$
in phases A, B and C.

Let PTAS-Greedy be algorithm PTAS-Gc with Gc = Greedy. We define
the quantity $\overline{cost}(\text{PTAS-Greedy}, \sigma, s, h) = opt(\sigma_h, s) + cost(Greedy, \sigma \setminus \sigma_{h'}, s)$,
where $h'$ is the value computed in Phase B. It is easy to see that
$\overline{cost}(\text{PTAS-Greedy}, \sigma, s, h) \ge cost(\text{PTAS-Greedy}, \sigma, s, h)$. Moreover, let Greedy*
denote the algorithm that, on input $\sigma$ and $s = (s_1, \ldots, s_n)$, returns as output
the best schedule among those computed by Greedy on input $\sigma$ and speed
vectors $(0, \ldots, 0, s_k, \ldots, s_n)$ for $k = 1, \ldots, n$. Let us also define
$\overline{cost}(Gc, \sigma, s, a) := cost(\text{Greedy*}, \sigma, s) + (1 + c)\, t_{\max}/a$.

It is then possible to prove the following results: (i) $cost(Greedy, \sigma, s) \le
cost(\text{Greedy*}, \sigma, s) + t_{\max}/s_1$, (ii) $cost(Gc, \sigma, s) \le \overline{cost}(Gc, \sigma, s, s_1)$, and (iii)
$\overline{cost}(Gc, \sigma, (0, \ldots, 0, s_k, s_{k+1}, \ldots, s_n), s_1) \le \overline{cost}(Gc, \sigma, (0, \ldots, 0, s_{k+1}, \ldots, s_n), s_1)$.

To upper bound the cost of PTAS-Gc, we consider the following quantity: 



cost(PTAS-Gc, CT, s, h) := opt{ah, s) + cost(Gc, ct \ ah',s, si). 



where h' is the index of the last job considered in Phase B of PTAS-Gc. Because 
of (ii) above, we have that cost(PTAS-Gc, ct, s, h) > cost(PTAS-Gc, ct, s, h). 

The next two lemmas provide an upper bound on cost(PTAS-Gc, ct, s, h). 

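Lemma 2. For any job sequence $\sigma$, any $h$, and any speed vector $s$ of length $n$,

$$\overline{\mathrm{cost}}(\text{PTAS-Greedy}, \sigma, s, h) \le \mathrm{cost}(\text{PTAS}, \sigma, s, h) + \frac{\mathrm{opt}(\sigma, s)}{h \cdot s_1} \Big( \sum_{i=1}^{n} s_i \Big) (n - 1).$$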



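Lemma 3. If Gc is $c$-greedy-close, then for any job sequence $\sigma$, any $h$, and any speed vector $s$ of length $n$,

$$\overline{\mathrm{cost}}(\text{PTAS-Gc}, \sigma, s, h) \le \overline{\mathrm{cost}}(\text{PTAS-Greedy}, \sigma, s, h) + \frac{(1 + c) \cdot \mathrm{opt}(\sigma, s)}{h \cdot s_1} \sum_{i=1}^{n} s_i.$$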



We next provide a bound on the cost of PTAS-Gc in terms of $\mathrm{opt}(\sigma, s)$ and $s_n/s_1$.

Theorem 1. If Gc is $c$-greedy-close then, for any job sequence $\sigma$, any $h$, and any speed vector $s$ of length $n$,

$$\mathrm{cost}(\text{PTAS-Gc}, \sigma, s, h) \le \mathrm{opt}(\sigma, s) \Big( 1 + \frac{f(n) + n^2 + c \cdot n}{h} \cdot \frac{s_n}{s_1} \Big).$$

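Proof. Since $\mathrm{cost}(\text{PTAS-Gc}, \sigma, s, h) \le \overline{\mathrm{cost}}(\text{PTAS-Gc}, \sigma, s, h)$, the previous lemmata give

$$
\begin{aligned}
\mathrm{cost}(\text{PTAS-Gc}, \sigma, s, h)
&\le \mathrm{cost}(\text{PTAS}, \sigma, s, h) + \frac{\mathrm{opt}(\sigma, s)}{h \cdot s_1} \Big( \sum_{i=1}^{n} s_i \Big)(n - 1) + (1 + c)\, \frac{\mathrm{opt}(\sigma, s)}{h \cdot s_1} \sum_{i=1}^{n} s_i \\
&\le \mathrm{opt}(\sigma, s) \Big( 1 + \frac{f(n)}{h} + \frac{n + c}{h \cdot s_1} \sum_{i=1}^{n} s_i \Big) \\
&\le \mathrm{opt}(\sigma, s) \Big( 1 + \frac{f(n) + n^2 + c \cdot n}{h} \cdot \frac{s_n}{s_1} \Big),
\end{aligned}
$$

where the last two inequalities follow from $\mathrm{cost}(\text{PTAS}, \sigma, s, h) \le \mathrm{opt}(\sigma, s)\big(1 + \frac{f(n)}{h}\big)$ (see [6]) and $s_i \le s_n$. □

Deterministic Truthful Approximation Mechanisms 615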

The bound given by Theorem 1 is good for small values of $s_n/s_1$. When, instead, $s_n$ is much larger than $s_1$, it might be convenient to neglect the machine with speed $s_1$ and run PTAS-Gc only on the remaining $n - 1$ machines. In the next theorem, we prove that in this way we can obtain a $(1 + \varepsilon)$-approximation for any value of $\varepsilon > 0$. The proof of the theorem is based on the following technical lemma.

Lemma 4. If Gc is greedy-close, then for all $\sigma$, $h$ and $s = (s_1, s_2, \ldots, s_n)$, $\mathrm{cost}(\text{PTAS-Gc}, \sigma, (s_1, s_2, \ldots, s_n), h) \le \mathrm{cost}(\text{PTAS-Gc}, \sigma, (0, s_2, \ldots, s_n), h)$.



Theorem 2. For any positive integer $n$ and for any $\varepsilon > 0$, if Gc is a polynomial-time greedy-close algorithm, then there exists an $h$ such that, for all $\sigma$ and for all speed vectors $s$ of length $n$, $\mathrm{cost}(\text{PTAS-Gc}, \sigma, s, h) \le (1 + \varepsilon)\,\mathrm{opt}(\sigma, s)$. Moreover, the running time of PTAS-Gc is polynomial in $m = |\sigma|$.

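Proof. We will prove by induction on $n$ that for any $\varepsilon > 0$ there exists an $h$, depending on $\varepsilon$ and $n$ only, such that $\mathrm{cost}(\text{PTAS-Gc}, \sigma, s, h) \le (1 + \varepsilon)\,\mathrm{opt}(\sigma, s)$. The base case $n = 1$ is trivial. For the inductive step assume that, for any $\varepsilon > 0$, there exists $h$ such that $\mathrm{cost}(\text{PTAS-Gc}, \sigma, (0, s_2, \ldots, s_n), h) \le (1 + \varepsilon)\,\mathrm{opt}(\sigma, (0, s_2, \ldots, s_n))$. If $s_n/s_1 \le c$, then by Theorem 1 it is possible to pick $h = h(n, \varepsilon)$ so that $\mathrm{cost}(\text{PTAS-Gc}, \sigma, s, h) \le (1 + \varepsilon)\,\mathrm{opt}(\sigma, s)$. Otherwise, pick $\varepsilon'$ such that $(1 + \varepsilon')(1 + s_1/s_n) \le (1 + \varepsilon)$. Then by Lemma 4 and by the inductive hypothesis it is possible to choose $h' = h'(n - 1, \varepsilon')$ such that

$$
\begin{aligned}
\mathrm{cost}(\text{PTAS-Gc}, \sigma, (s_1, s_2, \ldots, s_n), h')
&\le \mathrm{cost}(\text{PTAS-Gc}, \sigma, (0, s_2, \ldots, s_n), h') \\
&\le (1 + \varepsilon')\,\mathrm{opt}(\sigma, (0, s_2, \ldots, s_n)) \\
&\le (1 + \varepsilon')(1 + s_1/s_n)\,\mathrm{opt}(\sigma, (s_1, s_2, \ldots, s_n)) \\
&\le (1 + \varepsilon)\,\mathrm{opt}(\sigma, s).
\end{aligned}
$$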

Finally, the running time is -P mlogm -P poly(m)). □ 

We are now ready to prove the main result of this section.

Theorem 3. For any positive integer $n$ and for any $\varepsilon > 0$, if Gc is greedy-close, then there exists an $h = h(n, \varepsilon)$ such that for all sequences of jobs $\sigma$ and all speed vectors $s$ of length $n$, $\mathrm{cost}(\text{Opt-Gc}, \sigma, s) \le (2 + \varepsilon)\,\mathrm{opt}(\sigma, s)$.

Proof Sketch. Fix $\varepsilon > 0$. Let $h = h(n, \varepsilon)$ be such that $\mathrm{cost}(\text{PTAS-Gc}, \sigma, s, h) \le (1 + \varepsilon/2)\,\mathrm{opt}(\sigma, s)$ (such an $h$ exists by Theorem 2) and let $h'$ be the index of the last job scheduled during phase B by algorithm PTAS-Gc on input $\sigma$, $s$, and $h$. Construct a new job sequence $\sigma'$ from $\sigma$ by adding, just after $t_{h'}$, a copy of the jobs from $t_{h+1}$ to $t_{h'}$. It is possible to prove that the cost of the schedule produced by PTAS-Gc on input $\sigma'$, $s$, and $h$ is not less than the cost of the schedule produced by Opt-Gc on $\sigma$, $s$, and $h$ (see [3]).

We observe that the set of new jobs, considered independently from the rest of the sequence, can be scheduled in time $\mathrm{opt}(\sigma, s)$ (using the same schedule computed in phase B of PTAS-Gc) and thus $\mathrm{opt}(\sigma', s) \le 2\,\mathrm{opt}(\sigma, s)$. Then, we have

$$\mathrm{cost}(\text{Opt-Gc}, \sigma, s, h) \le \mathrm{cost}(\text{PTAS-Gc}, \sigma', s, h) \le (1 + \varepsilon/2)\,\mathrm{opt}(\sigma', s) \le (2 + \varepsilon)\,\mathrm{opt}(\sigma, s)$$

and the theorem follows. □

3 A Monotone Greedy-Close Algorithm 

In this section we describe a greedy-close algorithm that is monotone for the case of “divisible” speeds (see Def. 2 below). We present our algorithm for the case of integer divisible speeds; this is without loss of generality, since divisible speeds that are not integers can be scaled to be integers.

Let us consider the following algorithm: 

Algorithm uniform 

Input: a job sequence $\sigma$ and speed vector $s = (s_1, s_2, \ldots, s_n)$, with $s_1 \le s_2 \le \cdots \le s_n$.

A. run algorithm Greedy on job sequence $\sigma$ and $S = \sum_{i=1}^{n} s_i$ identical machines;

B. order the identical machines by nondecreasing load $l_1, \ldots, l_S$;

C. let $g := \mathrm{GCD}(s_1, s_2, \ldots, s_n)$ and split the identical machines into $g$ blocks $B_1, \ldots, B_g$, each consisting of $S/g$ consecutive identical machines. For $1 \le i \le g$ and $1 \le k \le S/g$, denote by $B_i(k)$ the $k$-th identical machine of the $i$-th block. Thus identical machine $B_i(k)$ has load $l_{(i-1) \cdot S/g + k}$.

D. for $1 \le j \le n$ let $k_j = \sum_{i=1}^{j-1} s_i/g$; then machine $j$ receives the load of identical machines $B_i(k_j + 1), \ldots, B_i(k_j + s_j/g)$, for each block $1 \le i \le g$;
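Steps A to D can be sketched directly in Python. This is the non-polynomial version that really simulates all $S$ identical machines, and it assumes that Greedy on identical machines gives each job, in sequence order, to a least loaded machine. The function names are hypothetical; speeds are assumed to be positive integers in nondecreasing order.

```python
import heapq
from functools import reduce
from math import gcd


def greedy_identical(jobs, num_machines):
    """Assumed Greedy on identical machines: each job goes to a least loaded one."""
    heap = [0] * num_machines            # a list of zeros is already a valid heap
    for t in jobs:
        heapq.heapreplace(heap, heap[0] + t)
    return sorted(heap)                  # step B: loads l_1 <= ... <= l_S


def uniform(jobs, speeds):
    """Return the work assigned to each of the n machines."""
    S = sum(speeds)
    g = reduce(gcd, speeds)
    loads = greedy_identical(jobs, S)    # steps A and B
    block = S // g                       # step C: machines per block
    work = [0] * len(speeds)
    start = 0                            # k_j = (s_1 + ... + s_{j-1}) / g
    for j, s in enumerate(speeds):
        width = s // g
        for i in range(g):               # step D: one slice from every block
            lo = i * block + start
            work[j] += sum(loads[lo:lo + width])
        start += width
    return work
```

For example, with speeds `[1, 2]` and jobs `[4, 3, 2, 1]` the three identical machines end with sorted loads `[3, 3, 4]`, so the slow machine receives work 3 and the fast one work 7.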

As described above, algorithm uniform does not run in polynomial time, as its running time depends on $S$ which, in general, is not polynomially bounded in $n$ and $m$. However, uniform can be easily modified so as to obtain the same allocation in $O(n \cdot m + m \log m)$ time.

3.1 Approximation Analysis of uniform

Consider, for a machine $j$ and a block $B_i$, the work of the $s_j/g$ identical machines from block $B_i$ whose loads are assigned to machine $j$. This work equals $\sum_{k=k_j+1}^{k_j+s_j/g} l_{(i-1) \cdot S/g + k}$.

Theorem 4. For any job sequence $\sigma$ and any integer speed vector $s = (s_1, s_2, \ldots, s_n)$ it holds that $\mathrm{cost}(\text{uniform}, \sigma, s) \le \mathrm{opt}(\sigma, s) + t_{\max}/g$, where $g = \mathrm{GCD}(s_1, s_2, \ldots, s_n)$.

When $s_1$ divides all $s_i$'s, we have that $g = s_1$ and the uniform algorithm is greedy-close. We then define sequences of speeds that enjoy this property, which will be used below to prove the monotonicity of uniform.

Definition 2 (divisible speeds). Let $C = \{c_1, c_2, \ldots, c_p, \ldots\}$, with the property that $c_i$ divides $c_{i+1}$. Then a speed vector $s = (s_1, s_2, \ldots, s_n)$ is divisible if $s \in C^n$. The restriction to divisible speeds denotes the problem version in which the set $C$ is known to the algorithm and all declared speeds must be in $C$.

We thus have the following theorem.

Theorem 5. Algorithm uniform is greedy-close when restricted to divisible 
speeds. 

3.2 Algorithm uniform Is Monotone 

In order to prove the monotonicity of algorithm uniform we first prove some 
technical results on greedy allocations on identical machines. 

Lemma 5. Let $L_i$ (respectively, $l_i$) denote the load of the $i$-th least loaded machine when Greedy uses $N$ (respectively, $N + 1$) identical machines. It then holds that $l_{i+1} \le L_i$, for all $1 \le i \le N$.

Lemma 6. Let $L_i$ (respectively, $l_i$) denote the load of the $i$-th least loaded machine when Greedy uses $N$ (respectively, $N' > N$) identical machines. It holds that $L_i \le l_i + l_{i+1}$, for $1 \le i \le N$.

Proof. We prove the lemma for $N' = N + 1$ since this implies the same result for any $N' > N$. For any $1 \le i \le N$, Lemma 5 yields $\sum_{k=1}^{i-1} L_k \ge \sum_{k=1}^{i-1} l_k$ and $\sum_{k=i+1}^{N} L_k \ge \sum_{k=i+2}^{N+1} l_k$. Since both allocations have the same total load, this implies $L_i \le l_i + l_{i+1}$. □
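The two inequalities for $N$ versus $N + 1$ identical machines can be checked numerically on small instances. The sketch below assumes that Greedy gives each job, in sequence order, to a least loaded machine, and reads Lemma 5 as $l_{i+1} \le L_i$ and Lemma 6 as $L_i \le l_i + l_{i+1}$; the helper names are hypothetical.

```python
import heapq


def sorted_greedy_loads(jobs, num_machines):
    """Loads after Greedy on identical machines, in nondecreasing order."""
    heap = [0] * num_machines
    for t in jobs:
        heapq.heapreplace(heap, heap[0] + t)   # add the job to a least loaded machine
    return sorted(heap)


def check_lemmas(jobs, N):
    """Return (Lemma 5 holds, Lemma 6 holds) for N versus N + 1 machines."""
    L = sorted_greedy_loads(jobs, N)           # L_1 <= ... <= L_N
    l = sorted_greedy_loads(jobs, N + 1)       # l_1 <= ... <= l_{N+1}
    lemma5 = all(l[i + 1] <= L[i] for i in range(N))
    lemma6 = all(L[i] <= l[i] + l[i + 1] for i in range(N))
    return lemma5, lemma6
```

For instance, `check_lemmas([1, 1, 2], 2)` compares the sorted loads `[1, 3]` on two machines with `[1, 1, 2]` on three machines.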

Lemma 7. Let $L_i$ (respectively, $l_i$) denote the load of the $i$-th least loaded machine when Greedy uses $N$ (respectively, $N' > N$) identical machines. For any $a$, $b$, $b'$ such that $N - b \le N' - b'$ it holds that

$$W(a, b) := \sum_{i=a}^{b} L_i \;\ge\; W'(a + b' - b, b') := \sum_{i=a+b'-b}^{b'} l_i.$$

Proof. Let $d = N' - N$. By repeatedly applying Lemma 5 we obtain $L_i \ge l_{i+d}$, for $1 \le i \le N$. Since $b' - b \le N' - N = d$, it holds that $L_i \ge l_{i+d} \ge l_{i+b'-b}$, for $1 \le i \le N$. This easily implies the lemma. □

We can now prove that uniform is monotone. Intuitively, if an agent increases 
her speed, then the overall work assigned to the other agents cannot decrease. 

Theorem 6. Algorithm uniform is monotone when restricted to divisible 
speeds. 

4 Polynomial-Time Mechanisms 

Computing the payments. We make use of the following result: 

Theorem 7 ([2]). A decreasing output function admits a truthful payment scheme satisfying voluntary participation if and only if $\int_0^\infty w_i(b_{-i}, u)\,du < \infty$ for all $i, b_{-i}$. In this case, we can take the payments to be

$$P_i(b_{-i}, b_i) = b_i w_i(b_{-i}, b_i) + \int_{b_i}^{\infty} w_i(b_{-i}, u)\,du. \qquad (1)$$
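For intuition, the payment of Theorem 7 is easy to evaluate when the work curve is a decreasing step function with bounded support. The sketch below assumes the form $b_i w_i(b_{-i}, b_i) + \int_{b_i}^{\infty} w_i(b_{-i}, u)\,du$ and treats the bid as a cost per unit of work; the representation of the curve as `(right_end, value)` pairs and the function names are hypothetical.

```python
from fractions import Fraction


def work(steps, u):
    """Step work curve: steps = [(right_end, value), ...] with increasing ends."""
    for right_end, value in steps:
        if u <= right_end:
            return value
    return 0  # beyond the last breakpoint the machine gets no work


def payment(steps, b):
    """b * w(b) plus the area under the work curve to the right of b."""
    area = Fraction(0)
    left = Fraction(0)
    for right_end, value in steps:
        lo = max(left, b)                 # only the part of the step right of b
        if right_end > lo:
            area += (right_end - lo) * value
        left = right_end
    return b * work(steps, b) + area


def utility(steps, true_cost, bid):
    """Payment received minus the true cost of the assigned work."""
    return payment(steps, bid) - true_cost * work(steps, bid)
```

With `steps = [(1, 10), (2, 4)]`, an agent with true cost 1/2 receives 14 when bidding truthfully, for a utility of 9, while overbidding 3/2 yields a utility of only 6.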

We next show how to compute the payments in Eq. (1) in polynomial time when the work curve corresponds to the allocation of PTAS-uniform.

Theorem 8. Let $A$ be a polynomial-time $r$-approximation algorithm. It is possible to compute the payment functions in Equation (1) in time $\mathrm{poly}(n, m)$ when (i) all speeds are integers not greater than some constant $M$, and (ii) speeds are divisible.

Proof. Observe that since A is an r-approximation algorithm there exists a 
value S < r ■ S, where S = such that on input (s_j, S), the algorithm 

assigns all jobs to machine $i$. Then, in order to compute the work curve of machine $i$ we only have to consider speed values in the interval $[0, S']$. Since $A$ runs in polynomial time, if speeds are integers, it is always possible to compute the work curve within time $O(S' \cdot \mathrm{poly}(n, m))$. When all speeds are not larger than $M$, we have that $S' \in O(n \cdot M)$ and the first part of the theorem follows.

Suppose now that speeds are divisible. In this case all the speeds belong to the interval $[2^{-l}, 2^{l}]$, where $l$ is the length in bits of the input. Then, there are $O(\log 2^{l})$ distinct speed values that machine $i$ can take. So, the computation of the work curve takes $O(l \cdot \mathrm{poly}(n, m)) = O(\mathrm{poly}(n, m))$. □

Truthful approximation mechanisms. 

Theorem 9. There exists a truthful polynomial-time $(2 + \varepsilon)$-approximation mechanism for $Q\,||\,C_{\max}$ when (i) all speeds are integers bounded above by some constant $M$, or (ii) speeds are divisible. Moreover, the mechanism satisfies voluntary participation and the payments can be computed in polynomial time.

Theorem 10. For every $\varepsilon > 0$, there exists a truthful polynomial-time $(4 + \varepsilon)$-approximation mechanism for $Q\,||\,C_{\max}$. Moreover, the mechanism satisfies voluntary participation and the payments can be computed in polynomial time.

Abstract. A set of $n$ independent jobs is to be scheduled without preemption on $m$ identical parallel machines. For each job $j$, a so-called diffuse adversary chooses the distribution $F_j$ of the random processing time $P_j$ from a certain class of distributions $\mathcal{F}_j$. The scheduler is given the expectation $\mu_j = E[P_j]$, but the actual duration is not known in advance. A positive weight $w_j$ is associated with each job $j$ and all jobs are ready for execution at time zero. The objective is to minimise the expected competitive ratio $\max_{F \in \mathcal{F}} E\big[\sum_j w_j C_j / \mathrm{OPT}\big]$, where $C_j$ denotes the completion time of job $j$ and OPT the offline optimum value. The scheduler determines a list of jobs, which is then scheduled in non-preemptive static list policy.

We show a general bound on the expected competitive ratio for list scheduling algorithms, which holds for a class of so-called new-better-than-used processing time distributions. This class includes, among others, the exponential distribution. Our bound depends on the probability of any pair of jobs being in the wrong order in the list of an arbitrary list scheduling algorithm, compared to an optimum list. As a special case, we show that the so-called WSEPT algorithm achieves $E[\mathrm{WSEPT}/\mathrm{OPT}] \le 3 - \frac{1}{m}$ for exponentially distributed processing times.



1 Introduction 

Scheduling problems were among the first combinatorial optimisation problems to be studied. The usually considered worst-case perspective often does not reflect that even a scheduling algorithm with bad worst-case behaviour may perform rather well in practical applications. A natural step to overcome this drawback is to consider stochastic scheduling, i.e., to interpret problem data as random variables and to measure algorithm performance by means of expected values. In this paper, we study the so-called expected competitive ratio of online list scheduling algorithms for stochastic completion time scheduling problems.



Model, Problem Definition, and Notation. Consider a set $J = \{1, 2, \ldots, n\}$ of $n$ independent jobs that have to be scheduled non-preemptively on a set $M = \{1, 2, \ldots, m\}$ of $m$ identical parallel machines. For each job $j$, a so-called diffuse adversary (see [12]) chooses the distribution $F_j$ of the random processing time $P_j \ge 0$ out of a certain class of distributions $\mathcal{F}_j$. Yet, we assume processing times to be independent. The scheduler is given the expectation $\mu_j = E[P_j]$ of each job $j$, but the actual realisation $p_j$ is only learned upon job completion. A positive weight $w_j$ is associated with each job $j \in J$ and all jobs are ready for execution at time zero. Every machine can process at most one job at a time. Each job can be executed by any of the machines, but preemption and delays are not permitted. The completion time $C_j$ of a job $j \in J$ is the latest point in time at which a machine is busy processing the job.

In list scheduling, jobs are processed according to a priority list. For numerous 
deterministic and stochastic problems, list scheduling strategies are known to be 
optimal, see e.g., [14]. This is especially true for non-preemptive and non-delay 
scheduling, since there, any schedule can be described by a list. 

Thus, we restrict ourselves to list scheduling and consider the following model. A so-called online list scheduling algorithm is given weight $w_j$ and mean $\mu_j$ for all $j \in J$ and, based on that information, at time zero deterministically constructs a permutation $\pi$ of $J$ which is called a static list. This list is then scheduled in the following policy: whenever a machine is idle and the list is not empty, the job at the head of the list is removed and processed non-preemptively and without delay on the idle machine (with least index). Notice that the actual realisations of processing times are learned only upon job completion, i.e., the list is constructed offline, while the schedule is constructed online.
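The static list policy is straightforward to simulate once a realisation of the processing times is fixed. The following sketch returns the total weighted completion time of a given list on $m$ identical machines; the function name `list_schedule` and its argument layout are hypothetical.

```python
import heapq


def list_schedule(order, p, w, m):
    """Total weighted completion time of the static list `order` on m machines.

    order: job indices in list order; p: realised processing times; w: weights.
    The machine that becomes idle first takes the head of the list; ties are
    broken in favour of the machine with least index.
    """
    free = [(0, i) for i in range(m)]   # (time the machine becomes idle, index)
    heapq.heapify(free)
    total = 0
    for j in order:
        t, i = heapq.heappop(free)      # earliest idle machine
        finish = t + p[j]               # no preemption, no delay
        total += w[j] * finish
        heapq.heappush(free, (finish, i))
    return total
```

For the list `[0, 1, 2]` with processing times `[3, 1, 2]` and unit weights, two machines give completion times 3, 1 and 3, so the total is 7; a single machine gives 3 + 4 + 6 = 13.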

Once a realisation $p = (p_1, p_2, \ldots, p_n)$ of job processing times is fixed, this policy yields a realisation of the random variable $\mathrm{TWC}(\pi) = \sum_{j \in J} w_j C_j$, which denotes the total weighted completion time for list $\pi$. Thus, for any realisation of job processing times, an offline optimum list $\pi^*$ is defined by

$$\mathrm{OPT}(p) = \mathrm{TWC}(\pi^*) = \min\{\mathrm{TWC}(\pi) : \pi \text{ is a permutation of } J\}. \qquad (1)$$

This yields the random variable OPT of the minimum value of the objective function for the random processing time vector $P = (P_1, P_2, \ldots, P_n)$. Let ALG be an online list scheduling algorithm and let $\pi$ denote the list produced by it on input $\mu = (\mu_1, \mu_2, \ldots, \mu_n)$ and $w = (w_1, w_2, \ldots, w_n)$. We define the random variable $\mathrm{ALG} = \mathrm{TWC}(\pi)$ as the total weighted completion time achieved by the algorithm ALG. It is important to note that any online list scheduling algorithm deterministically constructs one fixed list for all realisations, while the optimum list may be different for each realisation.

622 A. Souza and A. Steger

For any algorithm ALG, the ratio $\frac{\mathrm{ALG}}{\mathrm{OPT}}$ defines a random variable that measures the relative performance of that algorithm compared to the offline optimum. We may thus define the expected competitive ratio of an algorithm ALG by

$$R(\mathrm{ALG}, \mathcal{F}) = \max\left\{ E\left[\frac{\mathrm{ALG}}{\mathrm{OPT}}\right] : F \in \mathcal{F} \right\},$$

where the job processing time distributions $F = (F_1, F_2, \ldots, F_n)$ are chosen by a diffuse adversary from a class of distributions $\mathcal{F} = (\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_n)$. The objective is to minimise the expected competitive ratio, and thus an algorithm is called competitive optimal if it yields this minimum over all algorithms.
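For one fixed choice of distributions, the expectation $E[\mathrm{ALG}/\mathrm{OPT}]$ can be estimated by simulation. The sketch below assumes exponentially distributed processing times with the given means and computes the offline optimum by brute force over all lists, so it is only meant for a handful of jobs; the function names and default parameters are hypothetical.

```python
import heapq
import random
from itertools import permutations


def twc(order, p, w, m):
    """Total weighted completion time of a static list on m identical machines."""
    free = [(0.0, i) for i in range(m)]      # already a valid heap
    total = 0.0
    for j in order:
        t, i = heapq.heappop(free)
        heapq.heappush(free, (t + p[j], i))
        total += w[j] * (t + p[j])
    return total


def estimate_ratio(order, mu, w, m, samples=2000, seed=0):
    """Monte Carlo estimate of E[ALG/OPT] for exponential processing times."""
    rng = random.Random(seed)
    jobs = range(len(mu))
    acc = 0.0
    for _ in range(samples):
        p = [rng.expovariate(1.0 / mu_j) for mu_j in mu]   # mean mu_j
        opt = min(twc(pi, p, w, m) for pi in permutations(jobs))
        acc += twc(order, p, w, m) / opt
    return acc / samples
```

Note that the ratio is averaged per realisation, which is different from dividing the average of ALG by the average of OPT.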

In the standard classification scheme by Graham, Lawler, Lenstra, and Rinnooy Kan [8], this completion time scheduling problem is denoted

$$P \mid P_j \sim F_j(\mu_j) \in \mathcal{F}_j \mid \max E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right].$$

Our model can be seen as a hybrid between stochastic scheduling models (see [14]) and competitive analysis (see [1]), since it comprises important aspects of both.

As done in competitive analysis, our model relates the performance of an 
algorithm to the offline optimum on each instance. But rather than taking the 
maximum over all instances, we take the average over all instances weighted 
with the adversarial distribution. Notice that we are in the diffuse adversary 
model introduced by Koutsoupias and Papadimitriou [12], since our adversary 
is allowed to choose the distribution of problem relevant data out of a certain 
class of distributions. 

The similarities to stochastic scheduling are that processing times are distributed according to a probability distribution, and that the number $n$ of jobs, their weights $w$ and, most importantly, their expected durations $\mu$ are known. The most important difference is that in stochastic scheduling, the optimum is not defined as the offline optimum, but as the policy that, given $w$ and $\mu$ only, minimises the expected total weighted completion time.



Previous Work. Several deterministic completion time scheduling problems can be solved in polynomial time. Smith [22] shows that scheduling jobs in order of non-decreasing processing time and weight ratio (WSPT) is optimal for the single machine problem $1\,||\,\sum_j w_j C_j$. For the unweighted problem $P\,||\,\sum_j C_j$ on identical parallel machines, the optimality of the shortest processing time first (SPT) strategy is shown by Conway, Maxwell, and Miller [5].

In contrast, Bruno and Sethi [3] establish that the problem $Pm\,||\,\sum_j w_j C_j$ is NP-hard in the ordinary sense for constant $m \ge 2$ machines, while Sahni [19] shows that it admits an FPTAS. For $m$ considered part of the input, the problem $P\,||\,\sum_j w_j C_j$ is NP-hard in the strong sense [7]. However, Kawaguchi and Kyan [11] establish that WSPT achieves a $\frac{1}{2}(1 + \sqrt{2})$ approximation ratio. An exact algorithm is given by Sahni [19], and Skutella and Woeginger [25] establish a PTAS. Several constant factor approximations are known for variants of the problem, see e.g., Phillips, Stein, and Wein [15,16], Schulz [20], Hall, Schulz, Shmoys, and Wein [9], and Skutella [21].

Turning to stochastic scheduling, the expected value of the objective function of a deterministic problem is a natural choice as an objective for the probabilistic counterpart. Thus, in that performance measure, an algorithm ALG is considered optimal if it minimises the value $E[\mathrm{ALG}] = \int_x \mathrm{ALG}(x) f(x)\,dx$, where $\mathrm{ALG}(x)$ denotes the value of the objective function achieved by ALG on an instance $x$ with density $f(x)$.

Apparently, models that are NP-hard in a deterministic setting sometimes allow a simple priority policy to be optimal for the probabilistic counterpart. For example, scheduling jobs in order of non-decreasing expected processing time (SEPT) is known to be optimal for many problems with the objective $E\big[\sum_j C_j\big]$, see e.g., Rothkopf [18], Weiss and Pinedo [26], Bruno, Downey, and Frederickson [2], Kämpke [10], and Weber, Varaiya, and Walrand [27]. Moreover, for the problem $1 \mid P_j \sim \mathrm{Stoch}(\mu_j) \mid E\big[\sum_j w_j C_j\big]$, scheduling jobs in non-decreasing order of ratio $\mu_j/w_j$ (WSEPT) is optimal in non-preemptive static and dynamic policies, see e.g., [14]. By using LP relaxations, Möhring, Schulz, and Uetz [13] show $E[\mathrm{WSEPT}] \le (2 - \frac{1}{m})\,E[\mathrm{OPT}]$ for several variants of the scheduling problem $P \mid P_j \sim \mathrm{Stoch}(\mu_j) \mid E\big[\sum_j w_j C_j\big]$, where OPT denotes an optimum policy.



One property of the performance measure $E[\mathrm{ALG}]$ is that instances $x$ with small value $\mathrm{ALG}(x)$ tend to be neglected since they contribute little to the overall expected value. Hence, in this measure, algorithms are preferred that perform well on instances $x$ with large optimum value $\mathrm{OPT}(x)$. It depends on the application whether such behaviour is desirable, but if one is interested in algorithms that perform well on “many” instances, this measure may seem inappropriate.



Regarding this problem, the measure $E\big[\frac{\mathrm{ALG}}{\mathrm{OPT}}\big]$ seems to be interesting for the following intuition. The ratio $\frac{\mathrm{ALG}(x)}{\mathrm{OPT}(x)}$ relates the value of the objective function achieved by some algorithm ALG to the optimum OPT on the instance $x$. Thus, the algorithm performs well on instances that yield small ratio, and fails on instances with large ratio. Hence, if for “most” instances a small ratio is attained, the “few” instances with large ratio will not increase the expectation drastically.

However, it appears that in the context of stochastic scheduling, the measure $E\big[\frac{\mathrm{ALG}}{\mathrm{OPT}}\big]$ has only been considered by Coffman and Gilbert [4] and in the recent work by Scharbrodt, Schickinger, and Steger [23,24].

The former article [4] is concerned with the makespan scheduling problems $P \mid P_j \sim \mathrm{Exp}(\lambda) \mid E\big[\frac{C_{\max}}{\mathrm{OPT}}\big]$ and $P \mid P_j \sim \mathrm{Uni}(0, 1) \mid E\big[\frac{C_{\max}}{\mathrm{OPT}}\big]$ in non-preemptive static list policy.

In the latter papers [23,24], the problem $P \mid P_j \sim \mathrm{Stoch}(\mu_j) \mid E\big[\frac{\sum_j C_j}{\mathrm{OPT}}\big]$ is considered for non-preemptive static list policy. The main result is that the SEPT algorithm yields $E\big[\frac{\mathrm{SEPT}}{\mathrm{OPT}}\big] = O(1)$ for identical parallel machines under relatively weak assumptions on job processing time distributions.



Our Results. We introduce the class of distributions that are new-better-than-used in expectation relative to a function $h$ ($\mathrm{NBUE}_h$), which generalises the new-better-than-used in expectation (NBUE) class. The $\mathrm{NBUE}_{\mathrm{OPT}}$ class comprises the exponential, geometric, and uniform distribution.

Allowing the adversary to choose $\mathrm{NBUE}_{\mathrm{OPT}}$ processing time distributions, we derive bounds for online list scheduling algorithms for the problem

$$P \mid P_j \sim F(\mu_j) \in \mathrm{NBUE}_{\mathrm{OPT}} \mid \max E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right].$$

Our analysis depends on a quantity $\alpha$ which is an upper bound on the probability of any pair of jobs being in the wrong order in a list of an arbitrary online list algorithm ALG, compared to an optimum list.

Theorem 2 states that $R(\mathrm{ALG}, \mathrm{NBUE}_{\mathrm{OPT}}) \le \frac{1}{1 - \alpha}$ holds for the single machine problem. Corollary 2 claims $R(\mathrm{ALG}, \mathrm{NBUE}_{\mathrm{OPT}}) \le \frac{1}{1 - \alpha} + 1 - \frac{1}{m}$ for $m$ identical parallel machines. These results reflect well the intuition that an algorithm should perform better the lower its probability of sequencing jobs in a wrong order.

As a special case, we show that the WSEPT algorithm yields $E\big[\frac{\mathrm{WSEPT}}{\mathrm{OPT}}\big] \le 3 - \frac{1}{m}$ for $m$ identical parallel machines and exponentially distributed processing times. Simulations empirically demonstrate tightness of this bound.



2 New-Better-Than-Used Distributions 



Having to specify a class of distributions open to the diffuse adversary, we generalise the class of distributions that are new-better-than-used in expectation (NBUE). The concept of NBUE random variables is well-known in reliability theory [6], where it is considered as a relatively weak assumption. NBUE distributions are typically used to model the aging of system components, but have also proved useful in the context of stochastic scheduling. For the problem $P \mid \sqrt{\mathrm{Var}[P_j]} \le E[P_j] \mid E\big[\sum_j w_j C_j\big]$ the bound $E[\mathrm{WSEPT}] \le (2 - \frac{1}{m})\,E[\mathrm{OPT}]$ of Möhring, Schulz, and Uetz [13] holds for NBUE processing time distributions as an important special case. In addition, Pinedo and Weber [17] give bounds for shop scheduling problems assuming NBUE processing time distributions.

A random variable $X \ge 0$ is NBUE if $E[X - t \mid X > t] \le E[X]$ holds for all $t \ge 0$, see e.g., [6]. Examples of NBUE distributions are the uniform, exponential, Erlang, geometric, and Weibull distributions (the last with shape parameter at least one).
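The NBUE condition can be checked numerically through the survival function $S(x) = \Pr[X > x]$, using the standard identity $E[X - t \mid X > t] = \int_t^\infty S(x)\,dx / S(t)$. The following sketch uses a plain midpoint rule on a truncated range; the function names, the truncation point `upper` and the step count are hypothetical choices.

```python
import math


def mean_residual_life(survival, t, upper, steps=20000):
    """Approximate E[X - t | X > t] as (integral of S over [t, upper]) / S(t)."""
    width = (upper - t) / steps
    area = sum(survival(t + (k + 0.5) * width) for k in range(steps)) * width
    return area / survival(t)


def uniform_survival(x):
    """Survival function of X ~ Uni(0, 1)."""
    return min(1.0, max(0.0, 1.0 - x))


def exponential_survival(lam):
    """Survival function of X ~ Exp(lam), which has mean 1 / lam."""
    return lambda x: math.exp(-lam * x)
```

For Uni(0, 1) the residual mean at $t = 0.4$ is $0.3 \le E[X] = 0.5$, while for the exponential distribution it equals $E[X]$ for every $t$, so the NBUE inequality holds with equality.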

Let $X$ denote a random variable taking values in a set $V \subseteq \mathbb{R}_0^+$ and let $h(x) > 0$ be a real-valued function defined on $V$. The random variable $X \ge 0$ is new-better-than-used in expectation relative to $h$ ($\mathrm{NBUE}_h$) if

$$E\left[\frac{X - t}{h(X)} \,\middle|\, X > t\right] \le E\left[\frac{X}{h(X)}\right] \qquad (2)$$

holds for all $t \in V$, provided these expectations exist.

As $\mathrm{NBUE}_1$ variables are NBUE, the $\mathrm{NBUE}_h$ concept is indeed a generalisation of NBUE. Now we establish several general properties of $\mathrm{NBUE}_h$ distributions.

Lemma 1. If $X$ is $\mathrm{NBUE}_h$ and $\alpha > 0$, then

$$E\left[\frac{\alpha X - t}{h(X)} \,\middle|\, \alpha X > t\right] \le E\left[\frac{\alpha X}{h(X)}\right]$$

holds for all $t \in V$.



It is natural to extend the concept of $\mathrm{NBUE}_h$ distributions to functions $h$ that have more than one variable. Let $X$ denote a random variable taking values in a set $V \subseteq \mathbb{R}_0^+$, let $y \in W \subseteq \mathbb{R}_0^k$ for $k \in \mathbb{N}$ and let $h(x, y) > 0$ be a real-valued function defined on $(V, W)$. The random variable $X \ge 0$ is $\mathrm{NBUE}_h$ if

$$E\left[\frac{X - t}{h(X, y)} \,\middle|\, X > t\right] \le E\left[\frac{X}{h(X, y)}\right] \qquad (3)$$

holds for all $t \in V$ and all $y \in W$, provided these expectations exist.



Lemma 2. Let $X$ be $\mathrm{NBUE}_h$ and let $Y$ be a random vector taking values in $W$ independently of $X$, then

$$E\left[\frac{X - t}{h(X, Y)} \,\middle|\, X > t\right] \le E\left[\frac{X}{h(X, Y)}\right].$$

Let $g(y)$ be a function defined on $W$ taking values in $V$. Since (3) holds for all $t \in V$ it holds especially for $t = g(y)$.



Lemma 3. Let $X$ be $\mathrm{NBUE}_h$, let $Y$ be a random vector taking values in $W$ independently of $X$ and let $g(y)$ be a function defined on $W$ taking values in $V$, then

$$E\left[\frac{X - g(Y)}{h(X, Y)} \,\middle|\, X > g(Y)\right] \le E\left[\frac{X}{h(X, Y)}\right].$$

In fact, exponentially, geometrically, and uniformly distributed random variables are $\mathrm{NBUE}_h$ for non-decreasing functions $h$. We give a proof for the exponential distribution.



Lemma 4. If $X \sim \mathrm{Exp}(\lambda)$ and $h(x, y) > 0$ is non-decreasing in $x$ and $y$, then $X$ is $\mathrm{NBUE}_h$.

Proof. For all $s \ge 0$ we have $f_{X|X>t}(t + s) = f_X(s)$ since $X$ has memoryless density $f_X$. As $h$ is non-decreasing in $x$ and $y$ it holds that $h(t + s, y) \ge h(s, y)$ for $t \ge 0$. We therefore obtain

$$
\begin{aligned}
E\left[\frac{X - t}{h(X, y)} \,\middle|\, X > t\right]
&= \int_{x=t}^{\infty} \frac{x - t}{h(x, y)}\, f_{X|X>t}(x)\,dx
 = \int_{s=0}^{\infty} \frac{t + s - t}{h(t + s, y)}\, f_{X|X>t}(t + s)\,ds \\
&\le \int_{s=0}^{\infty} \frac{s}{h(s, y)}\, f_X(s)\,ds
 = E\left[\frac{X}{h(X, y)}\right],
\end{aligned}
$$

which proves the lemma. □

3 The Expected Competitive Ratio for Weighted Completion Time Scheduling

From now on, we allow the diffuse adversary to choose $\mathrm{NBUE}_{\mathrm{OPT}}$ processing time distributions in the following way: all jobs fall into the same class of distribution, e.g., they are all exponentially distributed, but the parameter, and thus the mean $\mu_j$ of each individual job $j$, is arbitrary. We denote this degree of freedom by $P_j \sim F(\mu_j) \in \mathrm{NBUE}_{\mathrm{OPT}}$ and consider the problem

$$P \mid P_j \sim F(\mu_j) \in \mathrm{NBUE}_{\mathrm{OPT}} \mid \max E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right]$$

for online list scheduling against a diffuse adversary. In Section 3.1 the single machine case is studied, and the results are generalised to identical parallel machines in Section 3.2.

For all $j, k \in J$ we define the indicator variable $M_{j,k}$ for the event that the jobs $j$ and $k$ are scheduled on the same machine. It is easily observed that for any list $\pi$ and job $j$ the random completion time satisfies $C_j = \sum_{k \le_\pi j} P_k M_{j,k}$, where $k \le_\pi j$ denotes that job $k$ is not after job $j$ in the list $\pi$.



3.1 Single Machine Scheduling 

A list $\pi$ is called a weighted shortest processing time first (WSPT) list (also known as Smith’s ratio rule [14]) if the jobs are in non-decreasing order of processing time and weight ratio, i.e.,

$$\frac{p_j}{w_j} \le \frac{p_k}{w_k} \quad \text{for } j \le_\pi k. \qquad (4)$$

It is a well-known fact in scheduling theory, see e.g., [22,14], that WSPT characterises the offline optimum for single machine scheduling.
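The optimality of Smith's ratio rule on a single machine can be confirmed on small instances by comparing a WSPT list with a brute-force search over all lists. The sketch below uses exact rational ratios; the helper names are hypothetical and processing times and weights are assumed to be positive integers.

```python
from fractions import Fraction
from itertools import permutations


def wspt_order(p, w):
    """Jobs sorted by non-decreasing ratio p_j / w_j."""
    return sorted(range(len(p)), key=lambda j: Fraction(p[j], w[j]))


def twc_single(order, p, w):
    """Total weighted completion time of a list on one machine."""
    t = total = 0
    for j in order:
        t += p[j]              # completion time of job j
        total += w[j] * t
    return total


def brute_force_opt(p, w):
    """Minimum total weighted completion time over all lists."""
    return min(twc_single(pi, p, w) for pi in permutations(range(len(p))))
```

For `p = [3, 1, 2, 4]` and `w = [1, 2, 1, 3]` the ratios are 3, 1/2, 2 and 4/3, so the WSPT list is `[1, 3, 2, 0]` with total weighted completion time 34.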

Bounding the Expected Competitive Ratio. Recall that $M_{j,k}$ takes the value one if jobs $j$ and $k$ are scheduled on the same machine, which is trivially true in single machine scheduling. Thus, $\mathrm{TWC}(\pi)$ can be rearranged to the more convenient form

$$\mathrm{TWC}(\pi) = \sum_{j \in J} w_j C_j = \sum_{j \in J} w_j \sum_{k \le_\pi j} P_k M_{j,k} = \sum_{j \in J} P_j \sum_{k \ge_\pi j} w_k.$$

For all $j, k \in J$, we define the random variable $\Delta_{j,k} = w_k P_j - w_j P_k$ and for any fixed list $\pi$ the indicator variable

$$X_{j,k} = \begin{cases} 1 & \text{if } \Delta_{j,k} > 0 \text{ and } k \ge_\pi j, \\ 0 & \text{otherwise.} \end{cases}$$

The intuition behind this is that the variable $X_{j,k}$ takes the value one if the jobs $j$ and $k$ are scheduled in the wrong order in a list produced by an algorithm, compared to an optimum list.

Theorem 1. For any list $\pi$ it holds that

$$\mathrm{TWC}(\pi) = \mathrm{OPT} + \sum_{j \in J} \sum_{k \ge_\pi j} \Delta_{j,k} X_{j,k}.$$

The proof is omitted due to space limitations, but the strategy is as follows: for any processing time vector $p$, the list $\pi$ is inductively rearranged into an optimum list $\pi^*$ by a sequence of reorderings. In each reordering, the variables $X_{j,k}$ and $\Delta_{j,k}$ are used to record by how much the total weighted completion time decreases.
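On a single machine the identity of Theorem 1 can be verified exhaustively for a small instance: for every list, the total weighted completion time should equal the optimum plus the sum of the positive $\Delta_{j,k}$ over pairs with $j$ before $k$ in the list. The sketch below does this with integer data; the helper names are hypothetical.

```python
from itertools import permutations


def twc_single(order, p, w):
    """Total weighted completion time of a list on one machine."""
    t = total = 0
    for j in order:
        t += p[j]
        total += w[j] * t
    return total


def excess(order, p, w):
    """Sum of Delta_{j,k} = w_k p_j - w_j p_k over wrongly ordered pairs.

    A pair counts when j precedes k in the list and Delta_{j,k} > 0.
    """
    total = 0
    for a, j in enumerate(order):
        for k in order[a + 1:]:
            delta = w[k] * p[j] - w[j] * p[k]
            if delta > 0:
                total += delta
    return total


def identity_holds(p, w):
    """Check TWC(pi) == OPT + excess(pi) for every list pi."""
    lists = list(permutations(range(len(p))))
    opt = min(twc_single(pi, p, w) for pi in lists)
    return all(twc_single(pi, p, w) == opt + excess(pi, p, w) for pi in lists)
```

The check works because each unordered pair of jobs contributes either $w_k p_j$ or $w_j p_k$ to the objective, and an optimum list pays the smaller of the two for every pair.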



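Theorem 2. Let ALG be any online list scheduling algorithm for the problem

$$1 \mid P_j \sim F(\mu_j) \in \mathrm{NBUE}_{\mathrm{OPT}} \mid \max E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right].$$

If $\Pr[X_{j,k} = 1] \le \alpha < 1$ holds for all $j \le_\pi k$ in all ALG lists $\pi$, then

$$R(\mathrm{ALG}, \mathrm{NBUE}_{\mathrm{OPT}}) \le \frac{1}{1 - \alpha}.$$

Proof. Let $\pi$ denote the fixed ALG list for expected processing times $\mu$ and weights $w$. By Lemma 3 and Lemma 1 we have that

$$E\left[\frac{\Delta_{j,k}}{\mathrm{OPT}} \,\middle|\, X_{j,k} = 1\right] = E\left[\frac{w_k P_j - w_j P_k}{\mathrm{OPT}} \,\middle|\, w_k P_j > w_j P_k\right] \le E\left[\frac{w_k P_j}{\mathrm{OPT}}\right] \qquad (5)$$

holds for all $\mathrm{NBUE}_{\mathrm{OPT}}$ processing time distributions. Theorem 1 and linearity of expectation establish

$$E\left[\frac{\mathrm{ALG}}{\mathrm{OPT}}\right] = E\left[\frac{\mathrm{OPT} + \sum_{j \in J} \sum_{k \ge_\pi j} \Delta_{j,k} X_{j,k}}{\mathrm{OPT}}\right] = 1 + \sum_{j \in J} \sum_{k \ge_\pi j} E\left[\frac{\Delta_{j,k} X_{j,k}}{\mathrm{OPT}}\right].$$

By conditioning on $X_{j,k} = 1$, application of (5) and by $\Pr[X_{j,k} = 1] \le \alpha$ for all $j \le_\pi k$ we obtain

$$
\begin{aligned}
E\left[\frac{\mathrm{ALG}}{\mathrm{OPT}}\right]
&= 1 + \sum_{j \in J} \sum_{k \ge_\pi j} \Pr[X_{j,k} = 1]\; E\left[\frac{\Delta_{j,k}}{\mathrm{OPT}} \,\middle|\, X_{j,k} = 1\right] \\
&\le 1 + \alpha \sum_{j \in J} \sum_{k \ge_\pi j} E\left[\frac{w_k P_j}{\mathrm{OPT}}\right]
 = 1 + \alpha\, E\left[\frac{\mathrm{ALG}}{\mathrm{OPT}}\right].
\end{aligned}
$$

Finally, rearranging the inequality and using $\alpha < 1$ completes the proof.

Analysis of the WSEPT Algorithm. Now we consider the popular WSEPT list scheduling algorithm, and calculate the expected competitive ratio for exponentially distributed job processing times, i.e., the adversary commits to the exponential distribution.

A list $\pi$ is called a weighted shortest expected processing time first (WSEPT) list if scheduling is done according to non-decreasing expected processing time and weight ratio, i.e.,

$$\frac{\mu_j}{w_j} \le \frac{\mu_k}{w_k} \quad \text{for } j \le_\pi k. \qquad (6)$$

The random variable $\mathrm{WSEPT} = \mathrm{TWC}(\pi)$ defines the total weighted completion time for WSEPT lists $\pi$. Notice that WSEPT is an online list scheduling algorithm since WSEPT lists can be determined with the knowledge of the weights and expected processing times, rather than their realisations.

A standard interchange argument, analogous to [24], proves Lemma 5 which states the competitive optimality of WSEPT for single machine scheduling with arbitrary processing time distributions.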



Lemma 5. An online list scheduling algorithm is competitive optimal for the
scheduling problem $1 \mid p_j \sim \mathrm{Stoch}(\mu_j) \mid E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right]$
if and only if it is WSEPT.



In practical applications, processing times are often modelled by exponentially
distributed random variables. Thus the bound obtained in Theorem 2 is of
particular interest in this special case.



Corollary 1. The WSEPT algorithm for the stochastic scheduling problem
$1 \mid p_j \sim \mathrm{Exp}(\mu_j^{-1}) \mid E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right]$
yields $E\left[\frac{\mathrm{WSEPT}}{\mathrm{OPT}}\right] \le 2$.



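Proof. Observe that the function $\mathrm{OPT}(p)$ is non-decreasing in $p$. Hence, by
Lemma 4, exponentially distributed random variables are $\mathrm{NBUE}_{\mathrm{OPT}}$. It is thus
sufficient to prove $\Pr[X_{j,k} = 1] \le \frac{1}{2}$ for $j <_\pi k$ in all WSEPT lists $\pi$. As
$w_k P_j \sim \mathrm{Exp}((w_k \mu_j)^{-1})$ and $w_j P_k \sim \mathrm{Exp}((w_j \mu_k)^{-1})$ we have

$$\Pr[X_{j,k} = 1] = \Pr[\Delta_{j,k} > 0] = \Pr[w_k P_j > w_j P_k]
= \int_{t=0}^{\infty} \frac{e^{-t/(w_j \mu_k)}}{w_j \mu_k} \int_{s=t}^{\infty} \frac{e^{-s/(w_k \mu_j)}}{w_k \mu_j} \, ds \, dt
= \frac{w_k \mu_j}{w_k \mu_j + w_j \mu_k} \le \frac{1}{2},$$

because $j <_\pi k$ implies $w_k \mu_j \le w_j \mu_k$ by the WSEPT ordering (6). Application
of Theorem 2 completes the proof. □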
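
The closed form used in the proof above can be checked numerically. The following Python sketch is an illustration only; the function names and the sample parameters are assumptions made here and do not appear in the paper. It compares the formula $w_k \mu_j / (w_k \mu_j + w_j \mu_k)$ with a Monte Carlo estimate of $\Pr[w_k P_j > w_j P_k]$ for independent exponential processing times.

```python
import random


def inversion_probability(w_j, mu_j, w_k, mu_k):
    """Closed form for Pr[w_k P_j > w_j P_k] with independent exponential P_j, P_k."""
    a = w_k * mu_j  # mean of w_k * P_j
    b = w_j * mu_k  # mean of w_j * P_k
    return a / (a + b)


def inversion_probability_mc(w_j, mu_j, w_k, mu_k, trials=200_000, seed=0):
    """Monte Carlo estimate of the same probability (hypothetical helper)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # random.expovariate takes the rate, i.e. 1 / mean
        p_j = rng.expovariate(1.0 / mu_j)
        p_k = rng.expovariate(1.0 / mu_k)
        if w_k * p_j > w_j * p_k:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Job j precedes job k in a WSEPT list: mu_j / w_j <= mu_k / w_k.
    args = (2.0, 1.0, 1.0, 3.0)  # w_j, mu_j, w_k, mu_k (illustrative values)
    print(inversion_probability(*args), inversion_probability_mc(*args))
```

For any pair ordered as in (6), the closed-form value is at most 1/2, which is the value of $\alpha$ that Theorem 2 needs here.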



Simulations of exponentially distributed processing times empirically
demonstrate tightness of Corollary 1.
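
A simulation of this kind could look roughly as follows. This is a minimal sketch and not the authors' experiment: the instance, the trial count, and all names are assumptions. It relies on the standard fact that, for realised processing times on a single machine, the offline optimum is obtained by sequencing jobs by non-decreasing $p_j / w_j$ (Smith's rule).

```python
import random


def total_weighted_completion(order, w, p):
    """Sum of w_j * C_j when jobs run back to back in the given order."""
    t = 0.0
    total = 0.0
    for j in order:
        t += p[j]
        total += w[j] * t
    return total


def estimate_wsept_ratio(w, mu, trials=20_000, seed=1):
    """Monte Carlo estimate of E[WSEPT / OPT] on one machine (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(w)
    # WSEPT list: fixed in advance from expected processing times and weights.
    wsept = sorted(range(n), key=lambda j: mu[j] / w[j])
    acc = 0.0
    for _ in range(trials):
        p = [rng.expovariate(1.0 / mu[j]) for j in range(n)]
        # Offline optimum for the realised times: Smith's rule on p_j / w_j.
        opt = sorted(range(n), key=lambda j: p[j] / w[j])
        acc += total_weighted_completion(wsept, w, p) / total_weighted_completion(opt, w, p)
    return acc / trials


if __name__ == "__main__":
    print(estimate_wsept_ratio([1.0] * 4, [1.0] * 4))
```

The estimate should lie between 1 and the bound of Corollary 1; how close it gets to 2 depends on the instance, which this sketch does not try to optimise.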




The Expected Competitive Ratio for Weighted Completion Time Scheduling 629 



3.2 Scheduling Identical Parallel Machines 

Now we generalise our results to online list scheduling on $m$ identical parallel
machines.

Let $\mathrm{ALG}^{(\ell)}$ and $\mathrm{OPT}^{(\ell)}$ denote the values of the objective function achieved
by the algorithm ALG and the offline optimum, respectively, for the processing
time vector $P$ on $\ell$ identical parallel machines. Moreover, the completion time
vector for a list $\pi$ on $\ell$ identical parallel machines is denoted by $C^{(\ell)}$.

The following lemmata are needed to reduce scheduling on $m$ parallel machines
to single machine scheduling. Similar approaches and proofs can be found
in [9] and [13,15].



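Lemma 6. Let $\pi$ be any job list for non-preemptive list scheduling and let $P$ be
processing times, then

$$C_j^{(m)} \le \frac{1}{m} C_j^{(1)} + \left(1 - \frac{1}{m}\right) P_j.$$



Lemma 7. For any processing time vector $P$ and the problem $P \mid\mid \sum_j w_j C_j$ we
have

$$\mathrm{OPT}^{(m)} \ge \frac{1}{m} \mathrm{OPT}^{(1)}.$$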



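Theorem 3. Let ALG be any online list scheduling algorithm for the problem
$P \mid p_j \sim \mathrm{Stoch}(\mu_j) \mid E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right]$, then

$$E\left[\frac{\mathrm{ALG}^{(m)}}{\mathrm{OPT}^{(m)}}\right] \le E\left[\frac{\mathrm{ALG}^{(1)}}{\mathrm{OPT}^{(1)}}\right] + 1 - \frac{1}{m}.$$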



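Proof. Lemma 6 and Lemma 7 establish

$$\frac{C_j^{(m)}}{\mathrm{OPT}^{(m)}}
\le \frac{C_j^{(1)}}{m\,\mathrm{OPT}^{(m)}} + \left(1 - \frac{1}{m}\right) \frac{P_j}{\mathrm{OPT}^{(m)}}
\le \frac{C_j^{(1)}}{\mathrm{OPT}^{(1)}} + \left(1 - \frac{1}{m}\right) \frac{P_j}{\mathrm{OPT}^{(m)}}.$$

Thus we have

$$\frac{\mathrm{ALG}^{(m)}}{\mathrm{OPT}^{(m)}}
\le \frac{\mathrm{ALG}^{(1)}}{\mathrm{OPT}^{(1)}} + \left(1 - \frac{1}{m}\right) \frac{\sum_j w_j P_j}{\mathrm{OPT}^{(m)}}
\le \frac{\mathrm{ALG}^{(1)}}{\mathrm{OPT}^{(1)}} + 1 - \frac{1}{m}$$

by $\mathrm{OPT}^{(m)} \ge \sum_j w_j P_j$. Taking expectations completes the proof. □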
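
The inequality of Lemma 6, on which this proof rests, can be checked on random instances. The Python sketch below is illustrative only: it assumes that list scheduling starts the next job of the list on whichever machine becomes idle first, and the function names are made up for this example.

```python
import heapq
import random


def list_schedule(order, p, m):
    """Completion times when jobs are started in list order on m identical machines."""
    free_at = [0.0] * m  # min-heap of times at which machines become idle
    completion = {}
    for j in order:
        start = heapq.heappop(free_at)  # earliest idle machine takes the next job
        completion[j] = start + p[j]
        heapq.heappush(free_at, completion[j])
    return completion


def lemma6_holds(order, p, m, tol=1e-9):
    """Check C_j^(m) <= C_j^(1) / m + (1 - 1/m) * P_j for every job j."""
    c_m = list_schedule(order, p, m)
    c_1 = list_schedule(order, p, 1)
    return all(
        c_m[j] <= c_1[j] / m + (1.0 - 1.0 / m) * p[j] + tol
        for j in order
    )


if __name__ == "__main__":
    rng = random.Random(0)
    p = [rng.expovariate(1.0) for _ in range(8)]
    print(lemma6_holds(list(range(8)), p, 3))
```

The check succeeds because a job can only start once every machine is busy, so its start time is at most the average load $\frac{1}{m}\sum_{i <_\pi j} P_i$ of the jobs listed before it.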

Corollary 2. Let ALG be any online list scheduling algorithm for the problem

$$P \mid p_j \sim F(\mu_j) \in \mathrm{NBUE}_{\mathrm{OPT}} \mid \max E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right].$$

If $\Pr[X_{j,k} = 1] \le \alpha < 1$ holds for all $j <_\pi k$ in all ALG lists $\pi$, then

$$R(\mathrm{ALG}, \mathrm{NBUE}_{\mathrm{OPT}}) \le \frac{1}{1-\alpha} + 1 - \frac{1}{m}.$$



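Corollary 3. The WSEPT algorithm for the stochastic scheduling problem
$P \mid p_j \sim \mathrm{Exp}(\mu_j^{-1}) \mid E\left[\frac{\sum_j w_j C_j}{\mathrm{OPT}}\right]$
yields $E\left[\frac{\mathrm{WSEPT}}{\mathrm{OPT}}\right] \le 3 - \frac{1}{m}$.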



Acknowledgement. The authors thank the anonymous referees for references
and suggestions which helped improve the paper.



References 

1. Alan Borodin and Ran El-Yaniv. Online Computation and Competitive Analysis. 
Cambridge University Press, 1998. 

2. J. Bruno, P. Downey, and G. N. Frederickson. Sequencing Tasks with Exponential 
Service Times to Minimize the Expected Flow Time or Makespan. Journal of the 
ACM, 28(1):100 - 113, 1981. 

3. E. C. Bruno, Jr. and R. Sethi. Scheduling independent tasks to reduce mean 
finishing time. Communications of the ACM, 17:382 - 387, 1974. 

4. Edward G. Coffman, Jr. and E. N. Gilbert. On the Expected Relative Performance
of List Scheduling. Operations Research, 33(3):548 - 561, 1985.

5. R. W. Conway, W. L. Maxwell, and L. W. Miller. Theory of Scheduling. Addison- 
Wesley Publishing Company, Reading, MA, 1967. 

6. I.R. Gertsbakh. Statistical Reliability Theory. Marcel Dekker, Inc., New York, NY, 
1989. 

7. M. R. Garey and D. S. Johnson. Computers and Intractability - A Guide to the
Theory of NP-Completeness. Freeman, San Francisco, CA, 1979.

8. R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan. Opti- 
mization and approximation in deterministic sequencing and scheduling theory: a 
survey. Annals of Discrete Mathematics, 5:287 - 326, 1979. 

9. Leslie A. Hall, Andreas S. Schulz, David B. Shmoys, and Joel Wein. Scheduling to 
minimize average completion time: Off-line and on-line approximation algorithms. 
Mathematics of Operations Research, 22:513 - 544, 1997. 

10. Thomas Kämpke. On the optimality of static priority policies in stochastic
scheduling on parallel machines. Journal of Applied Probability, 24:430 - 448, 1987.

11. Tsuyoshi Kawaguchi and Seiki Kyan. Worst Case Bound of an LRF Schedule for 
the Mean Weighted Flow-Time Problem. SIAM Journal on Computing, 15(4):1119 
- 1129, 1986. 







12. Elias Koutsoupias and Christos Papadimitriou. Beyond Competitive Analysis. 
Proceedings of the 35th Annual Symposium on Foundations of Computer Science 
(FOCS ’94), pages 394 - 400, 1994. 

13. Rolf H. Möhring, Andreas S. Schulz, and Marc Uetz. Approximation in Stochastic
Scheduling: The Power of LP-based Priority Rules. Journal of the ACM, 46:924 -
942, 1999.

14. Michael Pinedo. Scheduling - Theory, Algorithms, and Systems. Prentice-Hall, 
Englewood Cliffs, 1995. 

15. Cynthia Phillips, Clifford Stein, and Joel Wein. Scheduling Jobs That Arrive Over 
Time. In Proceedings of the Workshop on Algorithms and Data Structures 
(WADS ’95), volume 955 of Lecture Notes in Computer Science, pages 86 - 97. 
Springer Verlag, 1995. 

16. Cynthia A. Phillips, Cliff Stein, and Joel Wein. Minimizing average completion 
time in the presence of release dates. Mathematical Programming, 82:199 - 223, 
1998. 

17. Michael Pinedo and Richard Weber. Inequalities and bounds in stochastic shop
scheduling. SIAM Journal on Applied Mathematics, 44(4):867 - 879, 1984.

18. Michael H. Rothkopf. Scheduling with Random Service Times. Management Sci- 
ence, 12:707 - 713, 1966. 

19. Sartaj K. Sahni. Algorithms for Scheduling Independent Tasks. Journal of the 
ACM, 23:116 - 127, 1976. 

20. Andreas S. Schulz. Scheduling to Minimize Total Weighted Completion Time: Per- 
formance Guarantees of LP-Based Heuristics and Lower Bounds. In Proceedings 
of the International Conference on Integer Programming and Combinatorial Op- 
timization, volume 1084 of Lecture Notes in Computer Science, pages 301 - 315, 
Berlin, 1996. Springer Verlag. 

21. Martin Skutella. Semidefinite relaxations for parallel machine scheduling. Proceed- 
ings of the 39th Annual Symposium on Foundations of Computer Science (FOCS 
’98), pages 472 - 481, 1998. 

22. W. E. Smith. Various Optimizers for Single Stage Production. Naval Research 
Logistics Quarterly, 3:59 - 66, 1956. 

23. Thomas Schickinger, Marc Scharbrodt, and Angelika Steger. A new average case
analysis for completion time scheduling. Proceedings of the 34th Annual ACM
Symposium on Theory of Computing (STOC ’02), pages 170 - 178, 2002.

24. Thomas Schickinger, Marc Scharbrodt, and Angelika Steger. A new average case 
analysis for completion time scheduling. Journal of the ACM, accepted, 2003. 

25. Martin Skutella and Gerhard J. Woeginger. A PTAS for minimizing the total 
weighted completion time on identical parallel machines. Mathematics of Opera- 
tions Research, 25, 2000. 

26. Gideon Weiss and Michael Pinedo. Scheduling tasks with exponential service times 
on non-identical processors to minimize various cost functions. Journal of Applied 
Probability, 17:187 - 202, 1980. 

27. R. R. Weber, P. Varaiya, and J. Walrand. Scheduling jobs with stochastically
ordered processing times on parallel machines to minimize expected flowtime. Journal
of Applied Probability, 23:841 - 847, 1986.




Effective Strong Dimension in Algorithmic 
Information and Computational Complexity 



Krishna B. Athreya^1, John M. Hitchcock^2, Jack H. Lutz^3, and
Elvira Mayordomo^4

^1 School of Operations Research and Industrial Engineering, Cornell University,
Ithaca, NY 14853, USA and Departments of Mathematics and Statistics, Iowa State
University, Ames, IA 50011, USA. kba@iastate.edu
^2 Department of Computer Science, University of Wyoming, Laramie, WY 82071,
USA. jhitchco@cs.uwyo.edu
^3 Department of Computer Science, Iowa State University, Ames, IA 50011, USA.
lutz@cs.iastate.edu
^4 Departamento de Informática e Ingeniería de Sistemas, Universidad de Zaragoza,
50015 Zaragoza, SPAIN. elvira@posta.unizar.es



Abstract. The two most important notions of fractal dimension are 
Hausdorff dimension, developed by Hausdorff (1919), and packing di- 
mension, developed independently by Tricot (1982) and Sullivan (1984). 

Both dimensions have the mathematical advantage of being defined from
measures, and both have yielded extensive applications in fractal geometry
and dynamical systems.

Lutz (2000) has recently proven a simple characterization of Haus- 
dorff dimension in terms of gales, which are betting strategies that gen- 
eralize martingales. Imposing various computability and complexity con- 
straints on these gales produces a spectrum of effective versions of Haus- 
dorff dimension, including constructive, computable, polynomial-space, 
polynomial-time, and finite-state dimensions. Work by several investiga- 
tors has already used these effective dimensions to shed significant new 
light on a variety of topics in theoretical computer science. 

In this paper we show that packing dimension can also be characterized
in terms of gales. Moreover, even though the usual definition of
packing dimension is considerably more complex than that of Hausdorff
dimension, our gale characterization of packing dimension is an exact
dual of - and every bit as simple as - the gale characterization of
Hausdorff dimension.

Effectivizing our gale characterization of packing dimension pro- 
duces a variety of effective strong dimensions, which are exact duals 
of the effective dimensions mentioned above. In general (and in analogy
with the classical fractal dimensions), the effective strong dimension of a 
set or sequence is at least as great as its effective dimension, with equality 
for sets or sequences that are sufficiently regular. 

We develop the basic properties of effective strong dimensions and 
prove a number of results relating them to fundamental aspects of ran- 
domness, Kolmogorov complexity, prediction, Boolean circuit-size com- 
plexity, polynomial-time degrees, and data compression. Aside from the 
above characterization of packing dimension, our two main theorems are 
the following. 

1. If $\beta = (\beta_0, \beta_1, \ldots)$ is a computable sequence of biases that are
bounded away from 0 and $R$ is random with respect to $\beta$, then
the dimension and strong dimension of $R$ are the lower and upper
average entropies, respectively, of $\beta$.

2. For each pair of $\Delta^0_2$-computable real numbers $0 < \alpha \le \beta \le 1$, there
exists $A \in \mathrm{E}$ such that the polynomial-time many-one degree of $A$
has dimension $\alpha$ in E and strong dimension $\beta$ in E.

Our proofs of these theorems use a new large deviation theorem for
self-information with respect to a bias sequence $\beta$ that need not be
convergent.



1 Introduction 

Hausdorff dimension - a powerful tool of fractal geometry developed by Haus- 
dorff [8] in 1919 - was effectivized in 2000 by Lutz [17,18]. This has led to a spec- 
trum of effective versions of Hausdorff dimension, including constructive, com- 
putable, polynomial-space, polynomial-time, and finite-state dimensions. Work 
by several investigators has already used these effective dimensions to illumi- 
nate a variety of topics in algorithmic information theory and computational 
complexity [17,18,1,3,21,10,9,7,11,12,6]. (See [20] for a survey of some of these 
results.) This work has also underscored and renewed the importance of earlier 
work by Ryabko [22,23,24,25], Staiger [30,31,32], and Cai and Hartmanis [2] re- 
lating Kolmogorov complexity to classical Hausdorff dimension. (See Section 6 
of [18] for a discussion of this work.) 

The key to all these effective dimensions is a simple characterization of clas- 
sical Hausdorff dimension in terms of gales, which are betting strategies that 
generalize martingales. (Martingales, introduced by Levy [13] and Ville [36] have 
been used extensively by Schnorr [26,27,28] and others in the investigation of 
randomness and by Lutz [15,16] and others in the development of resource- 
bounded measure.) Given this characterization, it is a simple matter to impose 
computability and complexity constraints on the gales to produce the above- 
mentioned spectrum of effective dimensions. 

In the 1980s, a new concept of fractal dimension, called the packing
dimension, was introduced independently by Tricot [35] and Sullivan [33]. Packing
dimension shares with Hausdorff dimension the mathematical advantage of
being based on a measure. Over the past two decades, despite its greater complexity
(requiring an extra optimization over all countable decompositions of a set in
its definition), packing dimension has become, next to Hausdorff dimension, the
most important notion of fractal dimension, yielding extensive applications in
fractal geometry and dynamical systems [4,5].

The main result of this paper is a proof that packing dimension can also be 
characterized in terms of gales. Moreover, notwithstanding the greater complex- 
ity of packing dimension’s definition (and the greater complexity of its behavior 
on compact sets, as established by Mattila and Mauldin [19]), our gale charac- 
terization of packing dimension is an exact dual of - and every bit as simple as 
- the gale characterization of Hausdorff dimension. (This duality and simplic- 
ity are in the statement of our gale characterization; its proof is perforce more 
involved than its counterpart for Hausdorff dimension.) 

Effectivizing our gale characterization of packing dimension produces for each 
of the effective dimensions above an effective strong dimension that is its exact 
dual. Just as the Hausdorff dimension of a set is bounded above by its packing 
dimension, the effective dimension of a set is bounded above by its effective 
strong dimension. Moreover, just as in the classical case, the effective dimension 
coincides with the strong effective dimension for sets that are sufficiently regular. 

After proving our gale characterization and developing the effective strong
dimensions and some of their basic properties, we prove a number of results
relating them to fundamental aspects of randomness, Kolmogorov complexity,
prediction, Boolean circuit-size complexity, polynomial-time degrees, and data
compression. Our two main theorems along these lines are the following.

1. If $\delta > 0$ and $\beta = (\beta_0, \beta_1, \ldots)$ is a computable sequence of biases with each
$\beta_i \in [\delta, \frac{1}{2}]$, then every sequence $R$ that is random with respect to $\beta$ has
dimension

$$\dim(R) = \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \mathcal{H}(\beta_i)$$

and strong dimension

$$\mathrm{Dim}(R) = \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \mathcal{H}(\beta_i),$$

where $\mathcal{H}(\beta_i)$ is the Shannon entropy of $\beta_i$.

2. For every pair of $\Delta^0_2$-computable real numbers $0 < \alpha \le \beta \le 1$ there is a
decision problem $A \in \mathrm{E}$ such that the polynomial-time many-one degree of
$A$ has dimension $\alpha$ in E and strong dimension $\beta$ in E.

In order to prove these theorems, we prove a new large deviation theorem for
the self-information $\log \frac{1}{\mu^\beta(w)}$, where $\beta$ is as in 1 above. Note that $\beta$ need not
be convergent here.

A corollary of theorem 1 above is that, if the average entropies $\frac{1}{n} \sum_{i=0}^{n-1} \mathcal{H}(\beta_i)$
converge to a limit $H(\beta)$ as $n \to \infty$, then $\dim(R) = \mathrm{Dim}(R) = H(\beta)$. Since the
convergence of these average entropies is a much weaker condition than the
convergence of the biases $\beta_n$ as $n \to \infty$, this corollary substantially strengthens
Theorem 7.7 of [18].

Our remaining results are much easier to prove, but their breadth makes a
strong prima facie case for the utility of effective strong dimension. They in some 
cases explain dual concepts that had been curiously neglected in earlier work, 
and they are likely to be useful in future applications. It is to be hoped that we 
are on the verge of seeing the full force of fractal geometry applied fruitfully to 
difficult problems in the theory of computing. 



2 Fractal Dimensions 



In this section we briefly review the classical definitions of some fractal dimen- 
sions and the relationships among them. Since we are primarily interested in 
binary sequences and (equivalently) decision problems, we focus on fractal di- 
mension in the Cantor space C. 

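For each $k \in \mathbb{N}$, we let $\mathcal{A}_k$ be the collection of all prefix sets $A$ such that
$A_{<k} = \emptyset$. For each $X \subseteq \mathbf{C}$, we then define the families

$$\mathcal{A}_k(X) = \left\{ A \in \mathcal{A}_k \,\middle|\, X \subseteq \bigcup_{w \in A} \mathbf{C}_w \right\},$$

$$\mathcal{B}_k(X) = \{ A \in \mathcal{A}_k \mid (\forall w \in A)\ \mathbf{C}_w \cap X \ne \emptyset \}.$$

If $A \in \mathcal{A}_k(X)$, then we say that the prefix set $A$ covers the set $X$. If $A \in \mathcal{B}_k(X)$,
then we call the prefix set $A$ a packing of $X$. For $X \subseteq \mathbf{C}$, $s \in [0, \infty)$, and $k \in \mathbb{N}$,
we then define

$$H_k^s(X) = \inf_{A \in \mathcal{A}_k(X)} \sum_{w \in A} 2^{-s|w|}, \qquad
P_k^s(X) = \sup_{A \in \mathcal{B}_k(X)} \sum_{w \in A} 2^{-s|w|}.$$

Since $H_k^s(X)$ and $P_k^s(X)$ are monotone in $k$, the limits

$$H^s(X) = \lim_{k \to \infty} H_k^s(X), \qquad P_\infty^s(X) = \lim_{k \to \infty} P_k^s(X)$$

exist, though they may be infinite. We then define

$$P^s(X) = \inf \left\{ \sum_{i=0}^{\infty} P_\infty^s(X_i) \,\middle|\, X \subseteq \bigcup_{i=0}^{\infty} X_i \right\}. \tag{2.1}$$

The set functions $H^s$ and $P^s$ have the technical properties of an outer
measure [4], and the (possibly infinite) quantities $H^s(X)$ and $P^s(X)$ are thus
known as the s-dimensional Hausdorff (outer) cylinder measure of $X$ and the
s-dimensional packing (outer) cylinder measure of $X$, respectively. The set
function $P_\infty^s$ is not an outer measure; this is the reason for the extra optimization
(2.1) in the definition of the packing measure.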

Definition. Let $X \subseteq \mathbf{C}$.

1. The Hausdorff dimension of $X$ is $\dim_H(X) = \inf\{s \in [0, \infty) \mid H^s(X) = 0\}$.

2. The packing dimension of $X$ is $\dim_P(X) = \inf\{s \in [0, \infty) \mid P^s(X) = 0\}$.

The proof of our main result uses a well-known characterization of packing
dimension as a modified box dimension. For each $X \subseteq \mathbf{C}$ and $n \in \mathbb{N}$, let $N_n(X)$
be the number of strings of length $n$ that are prefixes of elements of $X$. Then the
upper box dimension of $X$ is

$$\overline{\dim}_B(X) = \limsup_{n \to \infty} \frac{1}{n} \log N_n(X).$$

Box dimensions are over 60 years old, have been re-invented many times, and
have been named many things, including Minkowski dimension, Kolmogorov
entropy, Kolmogorov dimension, topological entropy, metric dimension, logarithmic
density, and information dimension. Box dimensions are often used in practical
applications of fractal geometry because they are easy to estimate, but they are
not well-behaved mathematically. The modified upper box dimension

$$\overline{\dim}_{MB}(X) = \inf \left\{ \sup_i \overline{\dim}_B(X_i) \,\middle|\, X \subseteq \bigcup_{i=0}^{\infty} X_i \right\} \tag{2.2}$$

is much better behaved. (Note that (2.2), like (2.1), is an optimization over all
countable decompositions of $X$.) In fact, the following relations are well-known
[4].

Theorem 2.1. For all $X \subseteq \mathbf{C}$, $0 \le \dim_H(X) \le \overline{\dim}_{MB}(X) = \dim_P(X) \le
\overline{\dim}_B(X) \le 1$.

The above dimensions are monotone, i.e., $X \subseteq Y$ implies $\dim(X) \le \dim(Y)$,
and stable, i.e., $\dim(X \cup Y) = \max\{\dim(X), \dim(Y)\}$. The Hausdorff and packing
dimensions are also countably stable, i.e., $\dim(\bigcup_{i=0}^{\infty} X_i) = \sup\{\dim(X_i) \mid i \in \mathbb{N}\}$.
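
As a small illustration of how the prefix counts $N_n(X)$ determine the upper box dimension, the Python sketch below takes $X$ to be the set of binary sequences with no two consecutive 1s. This example set, the function name, and the use of base-2 logarithms are assumptions made for illustration; they are not taken from the paper. Every finite string without the factor 11 extends to a sequence in $X$, so $N_n(X)$ is the number of such strings of length $n$.

```python
import math


def count_prefixes_no_11(n):
    """N_n(X) for X = binary sequences containing no two consecutive 1s."""
    # end0 / end1: number of admissible strings ending in 0 / in 1.
    # The empty string is counted as "ending in 0" so that either bit may follow.
    end0, end1 = 1, 0
    for _ in range(n):
        end0, end1 = end0 + end1, end0
    return end0 + end1


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # (1/n) log N_n(X); the limsup of this quantity is the upper box dimension.
        print(n, math.log2(count_prefixes_no_11(n)) / n)
    print("log2 of the golden ratio:", math.log2((1 + math.sqrt(5)) / 2))
```

The counts are Fibonacci numbers, so the printed ratios approach $\log_2 \frac{1+\sqrt{5}}{2} \approx 0.694$, a value strictly between 0 and 1.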



3 Gale Characterizations 

In this section we review the gale characterization of Hausdorff dimension and 
prove our main theorem, which is the dual gale characterization of packing di- 
mension. 

Definition. Let $s \in [0, \infty)$.

1. An s-supergale is a function $d : \{0,1\}^* \to [0, \infty)$ that satisfies the condition

$$d(w) \ge 2^{-s}[d(w0) + d(w1)] \tag{3.1}$$

for all $w \in \{0,1\}^*$.

2. An s-gale is an s-supergale that satisfies (3.1) with equality for all $w$.

3. A supermartingale is a 1-supergale.

4. A martingale is a 1-gale.

Intuitively, we regard a supergale $d$ as a strategy for betting on the successive
bits of a sequence $S \in \mathbf{C}$. More specifically, $d(w)$ is the amount of capital that
$d$ has after betting on the prefix $w$ of $S$. If $s = 1$, then the right-hand side of
(3.1) is the conditional expectation of $d(wb)$ given that $w$ has occurred (when $b$
is a uniformly distributed binary random variable). Thus a martingale models a
gambler’s capital when the payoffs are fair. (The expected capital after the bet
is the actual capital before the bet.) In the case of an s-gale, if $s < 1$, the payoffs
are less than fair; if $s > 1$, the payoffs are more than fair.
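
Condition (3.1) can be made concrete with a small Python sketch. It uses the known correspondence that $d$ is an s-gale exactly when $2^{(1-s)|w|} d(w)$ is a martingale; the particular betting strategy (always staking a fixed fraction on the next bit being 0) and all names are assumptions chosen for illustration.

```python
def martingale_bet_on_zero(w, p0=0.75):
    """A martingale (1-gale): stake fraction p0 of the capital on the next bit being 0."""
    capital = 1.0
    for bit in w:
        capital *= 2.0 * (p0 if bit == "0" else 1.0 - p0)
    return capital


def s_gale(w, s, p0=0.75):
    """The s-gale obtained by rescaling the martingale above by 2^((s-1)|w|)."""
    return 2.0 ** ((s - 1.0) * len(w)) * martingale_bet_on_zero(w, p0)


def satisfies_gale_condition(d, w, s, tol=1e-12):
    """Check (3.1) with equality at the string w, up to floating-point tolerance."""
    lhs = d(w)
    rhs = 2.0 ** (-s) * (d(w + "0") + d(w + "1"))
    return abs(lhs - rhs) <= tol * max(1.0, lhs)


if __name__ == "__main__":
    s = 0.5
    d = lambda w: s_gale(w, s)
    print(all(satisfies_gale_condition(d, w, s) for w in ["", "0", "01", "1101"]))
    # Capital along the all-zero sequence for two values of s.
    print(s_gale("0" * 50, 0.5), s_gale("0" * 50, 0.3))
```

Along the all-zero sequence this gale's capital is $(2^{s-1} \cdot 1.5)^n$, which grows when $s > 1 - \log_2 1.5 \approx 0.415$ and shrinks below that value, so smaller $s$ makes success harder, in line with the "less than fair" reading above.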

We now define two criteria for the success of a gale or supergale. 

Definition. Let $d$ be an s-supergale, where $s \in [0, \infty)$.

1. We say that $d$ succeeds on a sequence $S \in \mathbf{C}$ if

$$\limsup_{n \to \infty} d(S[0..n-1]) = \infty.$$

The success set of $d$ is $S^\infty[d] = \{S \in \mathbf{C} \mid d \text{ succeeds on } S\}$.

2. We say that $d$ succeeds strongly on a sequence $S \in \mathbf{C}$ if

$$\liminf_{n \to \infty} d(S[0..n-1]) = \infty.$$

The strong success set of $d$ is $S^\infty_{\mathrm{str}}[d] = \{S \in \mathbf{C} \mid d \text{ succeeds strongly on } S\}$.

We have written conditions (1) and (2) in a fashion that emphasizes their
duality. Condition (1) says simply that the set of values $d(S[0..n-1])$ is unbounded,
while condition (2) says that $d(S[0..n-1]) \to \infty$ as $n \to \infty$.

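Notation. Let $X \subseteq \mathbf{C}$.

1. $\mathcal{G}(X)$ is the set of all $s \in [0, \infty)$ for which there exists an s-gale $d$ such that
$X \subseteq S^\infty[d]$.

2. $\mathcal{G}^{\mathrm{str}}(X)$ is the set of all $s \in [0, \infty)$ for which there exists an s-gale $d$ such
that $X \subseteq S^\infty_{\mathrm{str}}[d]$.

3. $\widehat{\mathcal{G}}(X)$ is the set of all $s \in [0, \infty)$ for which there exists an s-supergale $d$ such
that $X \subseteq S^\infty[d]$.

4. $\widehat{\mathcal{G}}^{\mathrm{str}}(X)$ is the set of all $s \in [0, \infty)$ for which there exists an s-supergale $d$
such that $X \subseteq S^\infty_{\mathrm{str}}[d]$.

Note that $s' \ge s \in \mathcal{G}(X)$ implies that $s' \in \mathcal{G}(X)$, and similarly for the classes
$\mathcal{G}^{\mathrm{str}}(X)$, $\widehat{\mathcal{G}}(X)$, and $\widehat{\mathcal{G}}^{\mathrm{str}}(X)$. The following fact is also clear.

Observation 3.1. For all $X \subseteq \mathbf{C}$, $\mathcal{G}(X) = \widehat{\mathcal{G}}(X)$ and $\mathcal{G}^{\mathrm{str}}(X) = \widehat{\mathcal{G}}^{\mathrm{str}}(X)$.

For Hausdorff dimension, we have the following known fact.

Theorem 3.2. (Gale Characterization of Hausdorff Dimension - Lutz [17]) For
all $X \subseteq \mathbf{C}$, $\dim_H(X) = \inf \mathcal{G}(X)$.

Our main result is the following dual of Theorem 3.2.

Theorem 3.3. (Gale Characterization of Packing Dimension) For all $X \subseteq \mathbf{C}$,
$\dim_P(X) = \inf \mathcal{G}^{\mathrm{str}}(X)$.

By Observation 3.1, we could equivalently use $\widehat{\mathcal{G}}(X)$ and $\widehat{\mathcal{G}}^{\mathrm{str}}(X)$ in Theorems
3.2 and 3.3, respectively.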

4 Effective Strong Dimensions

Theorem 3.2 has been used to effectivize Hausdorff dimension at a variety of 
levels. In this section we review these effective dimensions while using Theorem 
3.3 to develop the dual effective strong dimensions. 

We define a gale or supergale to be constructive if it is lower semicomputable.
The definitions of finite-state gamblers and finite-state gales appear in [3]. For the
rest of this paper, $\Delta$ denotes one of the classes all, comp, p, pspace, p$_2$, p$_2$space,
etc. that are defined in [17].

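For each $\Gamma \in \{\mathrm{constr}, \Delta, \mathrm{FS}\}$ and $X \subseteq \mathbf{C}$, we define the sets $\mathcal{G}_\Gamma(X)$,
$\mathcal{G}^{\mathrm{str}}_\Gamma(X)$, $\widehat{\mathcal{G}}_\Gamma(X)$, and $\widehat{\mathcal{G}}^{\mathrm{str}}_\Gamma(X)$ just as the classes $\mathcal{G}(X)$, $\mathcal{G}^{\mathrm{str}}(X)$, $\widehat{\mathcal{G}}(X)$, and
$\widehat{\mathcal{G}}^{\mathrm{str}}(X)$ were defined in Section 3, but with the following modifications.

(i) If $\Gamma = \mathrm{constr}$, then $d$ is required to be constructive.

(ii) If $\Gamma = \Delta$, then $d$ is required to be $\Delta$-computable.

(iii) In $\mathcal{G}_{\mathrm{FS}}(X)$ and $\mathcal{G}^{\mathrm{str}}_{\mathrm{FS}}(X)$, $d$ is required to be finite-state.

(iv) $\widehat{\mathcal{G}}_{\mathrm{FS}}(X)$ and $\widehat{\mathcal{G}}^{\mathrm{str}}_{\mathrm{FS}}(X)$ are not defined.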

The following effectivizations of Hausdorff and packing dimension are moti- 
vated by Theorems 3.2 and 3.3. 

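Definition. Let $X \subseteq \mathbf{C}$ and $S \in \mathbf{C}$.

1. [18] The constructive dimension of $X$ is $\mathrm{cdim}(X) = \inf \mathcal{G}_{\mathrm{constr}}(X)$.

2. The constructive strong dimension of $X$ is $\mathrm{cDim}(X) = \inf \mathcal{G}^{\mathrm{str}}_{\mathrm{constr}}(X)$.

3. [18] The dimension of $S$ is $\dim(S) = \mathrm{cdim}(\{S\})$.

4. The strong dimension of $S$ is $\mathrm{Dim}(S) = \mathrm{cDim}(\{S\})$.

5. [17] The $\Delta$-dimension of $X$ is $\dim_\Delta(X) = \inf \mathcal{G}_\Delta(X)$.

6. The $\Delta$-strong dimension of $X$ is $\mathrm{Dim}_\Delta(X) = \inf \mathcal{G}^{\mathrm{str}}_\Delta(X)$.

7. [17] The dimension of $X$ in $R(\Delta)$ is $\dim(X \mid R(\Delta)) = \dim_\Delta(X \cap R(\Delta))$.

8. The strong dimension of $X$ in $R(\Delta)$ is $\mathrm{Dim}(X \mid R(\Delta)) = \mathrm{Dim}_\Delta(X \cap R(\Delta))$.

9. [3] The finite-state dimension of $X$ is $\dim_{\mathrm{FS}}(X) = \inf \mathcal{G}_{\mathrm{FS}}(X)$.

10. The finite-state strong dimension of $X$ is $\mathrm{Dim}_{\mathrm{FS}}(X) = \inf \mathcal{G}^{\mathrm{str}}_{\mathrm{FS}}(X)$.

11. [3] The finite-state dimension of $S$ is $\dim_{\mathrm{FS}}(S) = \dim_{\mathrm{FS}}(\{S\})$.

12. The finite-state strong dimension of $S$ is $\mathrm{Dim}_{\mathrm{FS}}(S) = \mathrm{Dim}_{\mathrm{FS}}(\{S\})$.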

In parts 1, 2, 5, and 6 of the above definition, we could equivalently use the
“hatted” sets $\widehat{\mathcal{G}}_{\mathrm{constr}}(X)$, $\widehat{\mathcal{G}}^{\mathrm{str}}_{\mathrm{constr}}(X)$, $\widehat{\mathcal{G}}_\Delta(X)$, and $\widehat{\mathcal{G}}^{\mathrm{str}}_\Delta(X)$ in place of their
unhatted counterparts. In the case of parts 5 and 6, this follows from Lemma 4.7
of [17]. In the case of parts 1 and 2, it follows from the main theorem in [12]
(which answered an open question in [18], where $\widehat{\mathcal{G}}_{\mathrm{constr}}(X)$ was in fact used in
defining $\mathrm{cdim}(X)$).

The polynomial-time dimensions $\dim_{\mathrm{p}}(X)$ and $\mathrm{Dim}_{\mathrm{p}}(X)$ are also called the
feasible dimension and the feasible strong dimension, respectively. The notation
$\dim_{\mathrm{p}}(X)$ for the p-dimension is all too similar to the notation $\dim_P(X)$ for the
classical packing dimension, but confusion is unlikely because these dimensions
typically arise in quite different contexts.

Observations 4.1.

1. Each of the dimensions that we have defined is monotone
(e.g., $X \subseteq Y$ implies $\mathrm{cdim}(X) \le \mathrm{cdim}(Y)$).

2. Each of the effective strong dimensions is bounded below by the corresponding
effective dimension (e.g., $\mathrm{cdim}(X) \le \mathrm{cDim}(X)$).

3. Each of the dimensions that we have defined is nonincreasing as the
effectivity constraint is relaxed (e.g., $\dim_H(X) \le \mathrm{cdim}(X) \le \dim_{\mathrm{pspace}}(X) \le
\dim_{\mathrm{FS}}(X)$).

4. Each of the dimensions that we have defined is nonnegative and assigns $\mathbf{C}$
the dimension 1.



5 Algorithmic Information 



In this section we present a variety of results and observations in which
constructive and computable strong dimensions illuminate or clarify various aspects of
algorithmic information theory. Included is our second main theorem, which says
that every sequence that is random with respect to a computable sequence of
biases $\beta_i \in [\delta, 1/2]$ has the lower and upper average entropies of $(\beta_0, \beta_1, \ldots)$
as its dimension and strong dimension, respectively. We also present a result in
which finite-state strong dimension clarifies an issue in data compression.

Mayordomo [21] proved that for all $S \in \mathbf{C}$,

$$\dim(S) = \liminf_{n \to \infty} \frac{K(S[0..n-1])}{n}, \tag{5.1}$$

where $K(w)$ is the Kolmogorov complexity of $w$ [14]. Subsequently, Lutz [18] used
termgales to define the dimension $\dim(w)$ of each (finite!) string $w \in \{0,1\}^*$ and
proved that

$$\dim(S) = \liminf_{n \to \infty} \dim(S[0..n-1]) \tag{5.2}$$

for all $S \in \mathbf{C}$ and

$$K(w) = |w| \dim(w) \pm O(1) \tag{5.3}$$

for all $w \in \{0,1\}^*$, thereby giving a second proof of (5.1). The following theorem
is a dual of (5.2) that yields a dual of (5.1) as a corollary.

Theorem 5.1. For all $S \in \mathbf{C}$, $\mathrm{Dim}(S) = \limsup_{n \to \infty} \dim(S[0..n-1])$.

Corollary 5.2. For all $S \in \mathbf{C}$, $\mathrm{Dim}(S) = \limsup_{n \to \infty} \frac{K(S[0..n-1])}{n}$.

By Corollary 5.2, the “upper algorithmic dimension” defined by Tadaki [34]
is precisely the constructive strong dimension.

As the following result shows, the dimensions and strong dimensions of
sequences are essentially unrestricted.

Theorem 5.3. For any two real numbers $0 \le \alpha \le \beta \le 1$, there is a sequence
$S \in \mathbf{C}$ such that $\dim(S) = \alpha$ and $\mathrm{Dim}(S) = \beta$.

We now come to the main theorem of this section. The following notation 
simplifies its statement and proof. 

Notation. Given a bias sequence $\beta = (\beta_0, \beta_1, \ldots)$, $n \in \mathbb{N}$, and $S \in \mathbf{C}$, let

$$H_n(\beta) = \frac{1}{n} \sum_{i=0}^{n-1} \mathcal{H}(\beta_i),$$

$$H^-(\beta) = \liminf_{n \to \infty} H_n(\beta),$$

$$H^+(\beta) = \limsup_{n \to \infty} H_n(\beta).$$

We call $H^-(\beta)$ and $H^+(\beta)$ the lower and upper average entropies, respectively,
of $\beta$.
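
To see that $H^-(\beta)$ and $H^+(\beta)$ can genuinely differ, the following Python sketch computes the averages $H_n(\beta)$ for a bias sequence that switches between two values on blocks of doubling length. The bias sequence, the constants 0.1 and 0.5, and the function names are hypothetical choices for illustration; a finite computation can only suggest the liminf and limsup, not determine them.

```python
import math


def shannon_entropy(b):
    """Binary Shannon entropy H(b) in bits."""
    if b in (0.0, 1.0):
        return 0.0
    return b * math.log2(1.0 / b) + (1.0 - b) * math.log2(1.0 / (1.0 - b))


def average_entropies(beta, n_max):
    """Return [H_1(beta), ..., H_{n_max}(beta)], where beta maps index i to beta_i."""
    averages = []
    total = 0.0
    for i in range(n_max):
        total += shannon_entropy(beta(i))
        averages.append(total / (i + 1))
    return averages


def block_bias(i, low=0.1, high=0.5):
    """Illustrative bias sequence: constant on blocks of doubling length."""
    k = (i + 1).bit_length()  # index of the dyadic block containing i + 1
    return high if k % 2 == 0 else low


if __name__ == "__main__":
    tail = average_entropies(block_bias, 2 ** 14)[1024:]
    # The running averages keep oscillating between two distinct levels.
    print(min(tail), max(tail))
```

For this sequence the averages oscillate between roughly 0.65 and 0.82, so the biases do not converge and neither do the average entropies; this is the situation in which dimension and strong dimension are expected to separate.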

Theorem 5.4. If $\delta \in (0, \frac{1}{2}]$ and $\beta$ is a computable bias sequence with each $\beta_i \in
[\delta, \frac{1}{2}]$, then for every sequence $R \in \mathrm{RAND}^\beta$, $\dim(R) = H^-(\beta)$ and $\mathrm{Dim}(R) =
H^+(\beta)$.

Theorem 5.4 says that every sequence that is random with respect to a
suitable bias sequence $\beta$ has the lower and upper average entropies of $\beta$ as its
dimension and strong dimension, respectively. We now describe the most important
results that are used in our proof of Theorem 5.4.

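Notation. Given a bias sequence $\beta = (\beta_0, \beta_1, \ldots)$, $n \in \mathbb{N}$, and $S \in \mathbf{C}$, let

$$L_n(\beta)(S) = \log \frac{1}{\mu^\beta(S[0..n-1])} = \sum_{i=0}^{n-1} \xi_i(S),$$

where

$$\xi_i(S) = (1 - S[i]) \log \frac{1}{1 - \beta_i} + S[i] \log \frac{1}{\beta_i}$$

for $0 \le i < n$.

Note that $L_n(\beta), \xi_0, \ldots, \xi_{n-1}$ are random variables with

$$E L_n(\beta) = \sum_{i=0}^{n-1} E \xi_i = \sum_{i=0}^{n-1} \mathcal{H}(\beta_i) = n H_n(\beta).$$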

The following large deviation theorem tells us that $L_n(\beta)$ is very unlikely to
deviate significantly from this expected value.

Theorem 5.5. For each $\delta > 0$ and $\epsilon > 0$, there exists $\alpha \in (0, 1)$ such that, for
all bias sequences $\beta = (\beta_0, \beta_1, \ldots)$ with each $\beta_i \in [\delta, 1 - \delta]$ and all $n \in \mathbb{Z}^+$, if
$L_n(\beta)$ and $H_n(\beta)$ are defined as above, then

$$P[\,|L_n(\beta) - n H_n(\beta)| \ge \epsilon n\,] < 2\alpha^n,$$

where the probability is computed according to $\mu^\beta$.
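
The concentration asserted by Theorem 5.5 can be observed empirically. The Python sketch below is an illustration under stated assumptions: bit $i$ is drawn as 1 with probability $\beta_i$ (matching the definition of $\xi_i$ above), logarithms are base 2, and the bias sequence, $\epsilon$, and function names are arbitrary choices made here. It does not compute the constant $\alpha$ of the theorem.

```python
import math
import random


def shannon_entropy(b):
    """Binary Shannon entropy in bits, for 0 < b < 1."""
    return b * math.log2(1.0 / b) + (1.0 - b) * math.log2(1.0 / (1.0 - b))


def self_information(bits, beta):
    """L_n(beta)(S): sum of xi_i over the first n = len(bits) positions."""
    return sum(
        math.log2(1.0 / beta[i]) if bit else math.log2(1.0 / (1.0 - beta[i]))
        for i, bit in enumerate(bits)
    )


def deviation_frequency(beta, eps, trials=200, seed=0):
    """Fraction of sampled prefixes with |L_n - n H_n| >= eps * n (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(beta)
    expected = sum(shannon_entropy(b) for b in beta)  # equals n * H_n(beta)
    bad = 0
    for _ in range(trials):
        bits = [1 if rng.random() < b else 0 for b in beta]
        if abs(self_information(bits, beta) - expected) >= eps * n:
            bad += 1
    return bad / trials


if __name__ == "__main__":
    # A non-convergent bias sequence with every beta_i in [0.2, 0.5].
    beta = [0.2 + 0.3 * ((i % 7) / 6) for i in range(2000)]
    print(deviation_frequency(beta, eps=0.1))
```

With these parameters the deviation threshold $\epsilon n$ is several standard deviations of $L_n(\beta)$, so large deviations are essentially never observed, consistent with an exponentially small bound.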

Lemma 5.6. If $\delta > 0$ and $\beta$ is a computable bias sequence with each $\beta_i \in [\delta, \frac{1}{2}]$,
then $\mathrm{cdim}(\mathrm{RAND}^\beta) \le H^-(\beta)$ and $\mathrm{cDim}(\mathrm{RAND}^\beta) \le H^+(\beta)$.

Corollary 5.7. If $\beta$ is a computable sequence of coin-toss biases such that
$H(\beta) = \lim_{n \to \infty} H_n(\beta) \in (0, 1)$, then every sequence $R \in \mathbf{C}$ that is random with
respect to $\beta$ is c-regular, with $\dim(R) = \mathrm{Dim}(R) = H(\beta)$.

Note that Corollary 5.7 strengthens Theorem 7.6 of [18] because the
convergence of $H_n(\beta)$ is a weaker hypothesis than the convergence of $\beta$.

Dai, Lathrop, Lutz, and Mayordomo [3] investigated the finite-state
compression ratio $\rho_{\mathrm{FS}}(S)$, defined for each sequence $S \in \mathbf{C}$ to be the infimum,
taken over all information-lossless finite-state compressors $C$ (a model defined
in Shannon’s 1948 paper [29]) of the (lower) compression ratio

$$\rho_C(S) = \liminf_{n \to \infty} \frac{|C(S[0..n-1])|}{n}.$$

They proved that

$$\rho_{\mathrm{FS}}(S) = \dim_{\mathrm{FS}}(S) \tag{5.4}$$

for all $S \in \mathbf{C}$. However, it has been pointed out that the compression ratio
$\rho_{\mathrm{FS}}(S)$ differs from the one investigated by Ziv [37]. Ziv was instead concerned
with the ratio $R_{\mathrm{FS}}(S)$ defined by

$$R_{\mathrm{FS}}(S) = \inf_{k \in \mathbb{N}} \limsup_{n \to \infty} \inf_{C \in \mathcal{C}_k} \frac{|C(S[0..n-1])|}{n},$$

where $\mathcal{C}_k$ is the set of all $k$-state information-lossless finite-state compressors. The
following result, together with (5.4), clarifies the relationship between $\rho_{\mathrm{FS}}(S)$ and
$R_{\mathrm{FS}}(S)$.

Theorem 5.8. For all $S \in \mathbf{C}$, $R_{\mathrm{FS}}(S) = \mathrm{Dim}_{\mathrm{FS}}(S)$.

Thus, mathematically, the compression ratios $\rho_{\mathrm{FS}}(S)$ and $R_{\mathrm{FS}}(S)$ are both
natural: they are the finite-state effectivizations of the Hausdorff and packing
dimensions, respectively.



6 Computational Complexity 

In this section we prove our third main theorem, which says that the dimensions 
and strong dimensions of polynomial-time many-one degrees in exponential time 
are essentially unrestricted. Our proof of this result uses Theorem 5.5 and con- 
venient characterizations of p-dimension (due to Hitchcock [11]) and strong p- 
dimension (in the full version of this paper) in terms of feasible unpredictability. 
This theorem and its proof are motivated by analogous, but simpler arguments 
by Ambos-Spies, Merkle, Reimann and Stephan [1]. 

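Theorem 6.1. For every pair of $\Delta^0_2$-computable real numbers $x, y$ with $0 < x \le
y \le 1$, there exists $A \in \mathrm{E}$ such that

$$\dim_{\mathrm{p}}(\deg^{\mathrm{p}}_{\mathrm{m}}(A)) = \dim(\deg^{\mathrm{p}}_{\mathrm{m}}(A) \mid \mathrm{E}) = x$$

and

$$\mathrm{Dim}_{\mathrm{p}}(\deg^{\mathrm{p}}_{\mathrm{m}}(A)) = \mathrm{Dim}(\deg^{\mathrm{p}}_{\mathrm{m}}(A) \mid \mathrm{E}) = y.$$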

In light of Theorem 6.1, the following question concerning the relativized
feasible dimension of NP is natural.

Open Question. For which pairs of real numbers $\alpha, \beta \in [0, 1]$ does there exist
an oracle $A$ such that $\dim_{\mathrm{p}^A}(\mathrm{NP}^A) = \alpha$ and $\mathrm{Dim}_{\mathrm{p}^A}(\mathrm{NP}^A) = \beta$?

Acknowledgment. The third author thanks Dan Mauldin for extremely useful 
discussions. 

Abstract. We study a class of single-round, sealed-bid auctions for a
set of identical items. We adopt the worst case competitive framework
defined by [6,3] that compares the profit of an auction to that of an
optimal single price sale to at least two bidders. In this framework, we
give a lower bound of 2.42 (an improvement from the bound of 2 given
in [3]) on the competitive ratio of any truthful auction, one where each
bidder's best strategy is to declare the true maximum value an item is
worth to them. This result contrasts with the 3.39 competitive ratio of
the best known truthful auction [4].



1 Introduction 

A combination of recent economic and computational trends, such as the
negligible cost of duplicating digital goods and, most importantly, the emergence of
the Internet as one of the most important arenas for resource sharing between
parties with diverse and selfish interests, has created a number of new and
interesting dynamic pricing problems. It has also cast new light on more traditional
problems such as the problem of profit maximization for the seller in an auction.

A number of recent papers [6,3,4] have considered the problem of designing
auctions, for selling identical units of an item, that perform well in the worst case
under unknown market conditions. In these auctions, there is a seller with $\ell$
units for sale, and bidders each interested in obtaining one of them. Each bidder
has a valuation representing how much the item is worth to them. The auction
is performed by soliciting a sealed bid from each of the bidders, and deciding on
the allocation of units to bidders and the prices to be paid by the bidders. The
bidders are assumed to follow the strategy of bidding so as to maximize their
personal utility, the difference between their valuation and the price they pay.
To handle the problem of designing and analyzing auctions where bidders may
falsely declare their valuations to get a better deal, we will adopt the solution
concept of truthful mechanism design (see, e.g., [6,11,9]). In a truthful auction,
truth-telling, i.e., revealing their true valuation as their bid, is an optimal strategy
for each bidder regardless of the bids of the other bidders. In this paper, we will
restrict our attention to truthful (a.k.a. incentive compatible or strategyproof)
auctions.

* Work was done while the second author was at the University of Washington.

In research on such auctions, a form of competitive analysis is used to gauge
auction revenue. Specifically, a truthful auction's performance on a particular
bid vector is evaluated by comparing it against the profit that could be achieved
by an "optimal" omniscient auction, one that knows the true valuations of the
bidders in advance. An auction is $\beta$-competitive if it achieves a profit that is
within a factor of $\beta \ge 1$ of optimal on every input. The goal then becomes
to design the auction with the best competitive ratio, i.e., the auction that is
$\beta$-competitive with the smallest possible value of $\beta$.

A particularly interesting special case of the auction problem is the unlimited 
supply case. In this case the number of units for sale is at least the number of 
bidders in the auction. This is natural for the sale of digital goods where there is 
negligible cost for duplicating and distributing the good. Pay-per-view television 
and downloadable audio files are examples of such goods. 

For the unlimited supply auction problem, the competitive framework
introduced in [6] and further refined in [3] uses the profit of the optimal omniscient
single priced mechanism that sells at least two units as the benchmark for
competitive analysis. The assumption that two or more units are sold is necessary
because in the worst case it is impossible to obtain a constant fraction of the
profit of the optimal mechanism when it sells only one unit [6]. In this worst case
competitive framework, the best known auction for the unlimited supply has a
competitive ratio of 3.39 [4].

In this paper we also consider the case where the number of units for sale,
$\ell$, is limited, i.e., less than the number of bidders. At the opposite extreme from
unlimited supply is the limited supply case with $\ell = 2$.^1 In this case the Vickrey
auction [11], which sells to the highest bidder at the second highest bid value,
obtains the optimal worst case competitive ratio of 2 [3].

The main result of this paper is a lower bound on the competitive ratio of
any randomized auction. For $\ell = 2$, this lower bound is 2 (this was originally
proven in [3], though we give a much simpler proof of it here). For $\ell = 3$, the
lower bound is $13/6 \approx 2.17$, and as $\ell$ grows the bound approaches 2.42 in the
limit. We conjecture that this lower bound is tight. Yet, even in the case of three
units, the problem of constructing the auction matching our lower bound of 13/6
is open.

The rest of the paper is organized as follows. In Section 2 we give the
mathematical formulation of the auction problem that we will be studying, and we
describe the competitive framework that is used to analyze such auctions in the
worst case. In Section 3 we give our main result, a bound on how well any
auction can perform in the worst case. In Section 4 we describe attempts to obtain a
matching upper bound.

^1 Notice that the competitive framework is not well defined for the $\ell = 1$ case as the
optimal auction that sells at least two units cannot sell just one unit.

2 Preliminaries and Notation

We consider single-round, sealed-bid auctions for a set of $\ell$ identical units. As
mentioned in the introduction, we adopt the game theoretic solution concept of
truthful mechanism design. A useful simplification of the problem of designing
truthful auctions is obtained through the following algorithmic characterization.
Formulations related to the one we give here have appeared in numerous places
in recent literature (e.g., [2,10,3,7]). To the best of our knowledge, the earliest
dates back to the 1970s [8].

Definition 1. Given a bid vector of $n$ bids, $\mathbf{b} = (b_1, \ldots, b_n)$, let $\mathbf{b}_{-i}$ denote the
vector of bids with $b_i$ replaced with a '?', i.e.,

$$\mathbf{b}_{-i} = (b_1, \ldots, b_{i-1}, ?, b_{i+1}, \ldots, b_n).$$



Definition 2 (Bid-independent Auction, $\mathrm{BI}_f$). Let $f$ be a function from
bid vectors (with a '?') to prices (non-negative real numbers). The deterministic
bid-independent auction defined by $f$, $\mathrm{BI}_f$, works as follows. For each bidder $i$:

1. Set $t_i = f(\mathbf{b}_{-i})$.

2. If $t_i < b_i$, bidder $i$ wins at price $t_i$.

3. If $t_i > b_i$, bidder $i$ loses.

4. Otherwise ($t_i = b_i$), the auction can either accept the bid at price $t_i$ or reject
it.

A randomized bid-independent auction is a distribution over deterministic
bid-independent auctions.
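
The four steps of Definition 2 can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the names `bid_independent_auction`, `profit`, and the `accept_ties` flag are hypothetical, and the masked vector $\mathbf{b}_{-i}$ is represented simply by dropping bidder $i$'s entry, which is adequate only when $f$ does not depend on bidder positions.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A bid-independent pricing function: it sees every bid except bidder i's
# own (standing in for the masked vector b_{-i}) and returns the price
# offered to bidder i.
PriceFn = Callable[[Tuple[float, ...]], float]


def bid_independent_auction(
    bids: Sequence[float], f: PriceFn, accept_ties: bool = False
) -> List[Optional[float]]:
    """Return, for each bidder, the price paid, or None if the bidder loses."""
    outcome: List[Optional[float]] = []
    for i, b_i in enumerate(bids):
        others = tuple(b for j, b in enumerate(bids) if j != i)
        t_i = f(others)              # step 1: the price never looks at b_i
        if t_i < b_i:                # step 2: bidder i wins at price t_i
            outcome.append(t_i)
        elif t_i > b_i:              # step 3: bidder i loses
            outcome.append(None)
        else:                        # step 4: a tie may go either way
            outcome.append(t_i if accept_ties else None)
    return outcome


def profit(outcome: Sequence[Optional[float]]) -> float:
    """Sum of the prices charged to the bidders that are not rejected."""
    return sum(p for p in outcome if p is not None)
```

With `f = max`, the call `bid_independent_auction([7.0, 3.0, 5.0], max)` offers the highest bidder the second highest bid and prices everyone else out, which is the behaviour of the 1-item Vickrey auction when the largest bid is unique.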

The proof of the following theorem can be found, for example, in [3]. 

Theorem 1. An auction is truthful if and only if it is equivalent to a bid- 
independent auction. 

Given this equivalence, we will use the terminology bid-independent and
truthful interchangeably. We denote the profit of a truthful auction $A$ on input
$\mathbf{b}$ as $A(\mathbf{b})$. This profit is given by the sum of the prices charged to bidders that are
not rejected. For a randomized bid-independent auction, $A(\mathbf{b})$ and $f(\mathbf{b}_{-i})$ are
random variables.

It is natural to consider a worst case competitive analysis of truthful auctions.
In the competitive framework of [3] and subsequent papers, the performance of
a truthful auction is gauged in comparison to the optimal auction that sells at
least two units. There are a number of reasons to choose this metric for comparison;
interested readers should see [3] or [5] for a more detailed discussion.

Definition 3. The optimal single price omniscient auction that sells at least
two units (and at most $\ell$ units), $\mathcal{F}^{(2,\ell)}$, is defined as follows: Let $\mathbf{b}$ be a bid
vector of $n$ bids, and let $v_i$ be the $i$-th largest bid in the vector $\mathbf{b}$. Auction $\mathcal{F}^{(2,\ell)}$
on $\mathbf{b}$ chooses $k \in \{2, \ldots, \ell\}$ to maximize $k v_k$. The $k$ highest bidders are each
sold a unit at price $v_k$ (ties broken arbitrarily); all remaining bidders lose. Its
profit is:

$$\mathcal{F}^{(2,\ell)}(\mathbf{b}) = \max_{2 \le k \le \ell} k v_k.$$

In the unlimited supply case, i.e., when $\ell = n$, we define $\mathcal{F}^{(2)} = \mathcal{F}^{(2,n)}$.

Definition 4. We say that auction $A$ is $\beta$-competitive if for all bid vectors $\mathbf{b}$,
the expected profit of $A$ on $\mathbf{b}$ satisfies

$$E[A(\mathbf{b})] \ge \frac{\mathcal{F}^{(2,\ell)}(\mathbf{b})}{\beta}.$$

The competitive ratio of the auction $A$ is the infimum of $\beta$ for which the auction
is $\beta$-competitive.
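
As a small illustration of Definitions 3 and 4, the benchmark profit can be computed by sorting the bids and scanning over $k$. The function names below are hypothetical, and the sketch assumes the benchmark is the maximum of $k v_k$ over $2 \le k \le \ell$ as described in Definition 3.

```python
from typing import Optional, Sequence


def optimal_single_price_profit(bids: Sequence[float], ell: Optional[int] = None) -> float:
    """Max over 2 <= k <= ell of k * v_k, where v_k is the k-th largest bid."""
    v = sorted(bids, reverse=True)
    if ell is None:
        ell = len(v)                 # unlimited supply: ell = n
    top = min(ell, len(v))
    if top < 2:
        raise ValueError("the benchmark needs at least two bids and two units")
    return max(k * v[k - 1] for k in range(2, top + 1))


def is_beta_competitive_on(
    expected_profit: float, bids: Sequence[float], beta: float, ell: Optional[int] = None
) -> bool:
    """Check the inequality of Definition 4 on a single bid vector."""
    return expected_profit * beta >= optimal_single_price_profit(bids, ell)
```

For the bids `[10, 4, 3, 1]` the candidates are $2 \cdot 4$, $3 \cdot 3$ and $4 \cdot 1$, so the unlimited supply benchmark sells three units at price 3; restricting to `ell=2` forces the sale of two units at price 4.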

2.1 Limited Supply versus Unlimited Supply 

Throughout the remainder of this paper we will be making the assumption that
$n = \ell$, i.e., the number of bidders is equal to the number of items for sale. The
justification for this is that any lower bound that applies to the $n = \ell$ case also
extends to the case where $n \ge \ell$. To see this, note that an $\ell$ item auction $A$ that
is $\beta$-competitive for any $n \ge \ell$ bidder input must also be $\beta$-competitive on the
subset of all $n$ bidder bid vectors that have $n - \ell$ bids at value zero. Thus, we can
simply construct an $A'$ that takes an $\ell$ bidder input $\mathbf{b}'$, augments it with $n - \ell$
zeros to get $\mathbf{b}$, and simulates the outcome of $A$ on $\mathbf{b}$. Since $\mathcal{F}^{(2,\ell)}(\mathbf{b}') = \mathcal{F}^{(2,\ell)}(\mathbf{b})$,
$A'$ obtains at least the competitive ratio of $A$.

In the other direction, a reduction from the unlimited supply auction problem
to the limited supply auction problem given in [5] shows how to take an unlimited
supply auction that is $\beta$-competitive with $\mathcal{F}^{(2)}$ and construct a limited supply
auction parameterized by $\ell$ that is $\beta$-competitive with $\mathcal{F}^{(2,\ell)}$.

Henceforth, we will assume that we are in the unlimited supply case, and we
will examine lower bounds for limited supply problems by placing a restriction
on the number of bidders in the auction.

2.2 Symmetric Auctions 

In the remainder of this paper, we restrict attention to symmetric auctions. An
auction is symmetric if its output is not a function of the order of the bids in the
input vector, $\mathbf{b}$. We note that there is no loss of generality in this assumption,
as the following result shows.

Lemma 1. For any $\beta$-competitive asymmetric truthful auction there is a
symmetric randomized truthful auction with competitive ratio at least $\beta$.

Proof. Given a $\beta$-competitive asymmetric truthful auction, $A$, we construct a
symmetric truthful auction $A'$ that first permutes the input bids $\mathbf{b}$ at random
to get $\pi(\mathbf{b})$ and then runs $A$ on $\pi(\mathbf{b})$. Note, $\mathcal{F}^{(2)}(\mathbf{b}) = \mathcal{F}^{(2)}(\pi(\mathbf{b}))$ and since $A$
is $\beta$-competitive on $\pi(\mathbf{b})$ for any choice of $\pi$, $A'$ is $\beta$-competitive on $\mathbf{b}$. ■
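
The construction in the proof of Lemma 1 can be sketched as a wrapper around an arbitrary auction. This is an illustrative sketch only: it assumes an auction is modelled as a function from a list of bids to a list of per-bidder payments (with `None` for a losing bidder), and the name `symmetrize` is hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence

# Assumed model: an auction maps bids to per-position payments (None = loses).
Auction = Callable[[Sequence[float]], List[Optional[float]]]


def symmetrize(auction: Auction, rng: Optional[random.Random] = None) -> Auction:
    """Wrap an auction so that it first applies a uniformly random permutation."""
    rng = rng or random.Random()

    def symmetric_auction(bids: Sequence[float]) -> List[Optional[float]]:
        n = len(bids)
        perm = list(range(n))
        rng.shuffle(perm)                        # position j receives bidder perm[j]
        permuted = [bids[perm[j]] for j in range(n)]
        payments = auction(permuted)             # run the original auction on pi(b)
        result: List[Optional[float]] = [None] * n
        for j, payment in enumerate(payments):
            result[perm[j]] = payment            # report outcomes in the original order
        return result

    return symmetric_auction
```

Because the permutation is drawn independently of the bids, each bidder's offered price still does not depend on that bidder's own bid, so truthfulness is preserved under this model.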

2.3 Example: The Vickrey Auction

The classical truthful auction is the 1-item Vickrey auction (a.k.a. the second
price auction). This auction sells to the highest bidder at the second highest bid
value. To see how this fits into the bid-independent framework, note that the
auction $\mathrm{BI}_{\max}$ (the bid-independent auction with $f = \max$) does exactly this
(assuming that the largest bid is unique).

As an example we consider the competitive ratio of the Vickrey auction in
the case where there are only two bidders. Given two bids, $\mathbf{b} = \{b_1, b_2\}$, the
optimal single price sale of two units just sells both units for the smaller of the
two bid values, i.e., the optimal profit is $\mathcal{F}^{(2)}(\mathbf{b}) = 2 \min(b_1, b_2)$. Of course, the
1-item Vickrey auction sells to the highest bidder at the second highest price
and thus has a profit of $\min(b_1, b_2)$. Therefore, we have:

Observation 1. The Vickrey auction on two bidders is 2-competitive.

It turns out that this is optimal for two bidders. Along with the general lower
bound of 2.42, in the next section we give a simplified proof of the result,
originally from [3], that no two bidder truthful auction is better than 2-competitive.



3 A Lower Bound on the Competitive Ratio 



In this section we prove a lower bound on the competitive ratio of any truthful
auction in comparison to $\mathcal{F}^{(2)}$; we show that for any randomized truthful auction,
$A$, there exists an input bid vector, $\mathbf{b}$, on which

$$E[A(\mathbf{b})] \le \frac{\mathcal{F}^{(2)}(\mathbf{b})}{2.42}.$$



In our lower bound proof we will be considering randomized distributions 
over bid vectors. To avoid confusion, we will adopt the following notation. A real 
valued random variable will be given in uppercase, e.g., X and Ti. In accordance 
with this notation, we will use Bi as the random variable for bidder i’s bid value. 
A vector of real valued random variables will be a bold uppercase letter, e.g., B 
is a vector of random bids. 

To prove the lower bound, we analyze the behavior of $A$ on a bid vector
chosen from a probability distribution over bid vectors. The outcome of the
auction is then a random variable depending on both the randomness in $A$ and
the randomness in $\mathbf{B}$. We will give a distribution on bidder bids and show that
it satisfies $E_{\mathbf{B}}[E_A[A(\mathbf{B})]] \le E_{\mathbf{B}}[\mathcal{F}^{(2)}(\mathbf{B})]/2.42$. We then use the following fact to claim
that there must exist a fixed choice of bids, $\mathbf{b}$ (depending on $A$), for which
$E[A(\mathbf{b})] \le \mathcal{F}^{(2)}(\mathbf{b})/2.42$.

Fact 1. Given random variable $X$ and two functions $f$ and $g$, $E[f(X)] \le
E[g(X)]$ implies that there exists $x$ such that $f(x) \le g(x)$.

As a quick proof of this fact, observe that if for all $x$, $f(x) > g(x)$ then it would
be the case that $E[f(X)] > E[g(X)]$ instead of the other way around.

A key step in obtaining the lower bound is in defining a distribution over bid 
vectors on which any truthful auction obtains the same expected revenue. 

Definition 5. Let the random vector of bids $\mathbf{B}^{(n)}$ be $n$ i.i.d. bids generated from
the distribution with each bid $B_i$ satisfying $\Pr[B_i > z] = 1/z$ for all $z \ge 1$.

Lemma 2. For $\mathbf{B}^{(n)}$ defined above, any truthful auction, $A$, has expected
revenue satisfying

$$E[A(\mathbf{B}^{(n)})] \le n.$$

Proof. Consider a truthful auction $A$. Let $T_i$ be the price offered to bidder $i$ in
the bid-independent implementation of $A$. $T_i$ is a random variable depending
on $A$ and $\mathbf{B}_{-i}$, and therefore $T_i$ and $B_i$ are independent random variables. Let
$P_i$ be the price paid by bidder $i$, i.e., 0 if $B_i < T_i$ and $T_i$ otherwise. For $t \ge 0$,
$E[P_i \mid T_i = t] = t \cdot \Pr[B_i \ge t \mid T_i = t] = t \cdot \Pr[B_i \ge t] \le 1$, since $B_i$ is
independent of $T_i$. Therefore $E[P_i] \le 1$ and $E[A(\mathbf{B}^{(n)})] = \sum_i E[P_i] \le n$. ■



For the input $\mathbf{B}^{(n)}$ an auction attempting to maximize the profit of the seller
has no reason to ever offer prices less than one. The proof of the above lemma
shows that any auction that always offers prices of at least one has expected
revenue exactly $n$.
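
The equal-revenue property behind Lemma 2 is easy to check numerically. The sketch below is an illustration under stated assumptions: bids are drawn by inverse-transform sampling as $1/(1-U)$ with $U$ uniform on $[0,1)$, and the auction simply offers every bidder the same fixed price `t` (a trivially bid-independent rule chosen here because its revenue has bounded variance). The function names are hypothetical.

```python
import random


def sample_bid(rng: random.Random) -> float:
    """Draw B with Pr[B > z] = 1/z for z >= 1 via inverse-transform sampling."""
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and B >= 1.
    return 1.0 / (1.0 - rng.random())


def expected_payment(t: float) -> float:
    """Exact E[P_i | T_i = t] = t * Pr[B_i >= t] for an offered price t."""
    if t <= 0.0:
        return 0.0
    return t * min(1.0, 1.0 / t)


def simulated_revenue(n: int, t: float, trials: int, rng: random.Random) -> float:
    """Average revenue when each of n bidders is offered the fixed price t."""
    total = 0.0
    for _ in range(trials):
        total += sum(t for _ in range(n) if sample_bid(rng) >= t)
    return total / trials
```

Any price of at least one yields an expected payment of exactly 1 per bidder, while a price below one yields less, which matches the remark that such prices are never worth offering on this input.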



3.1 The n = 2 Case



To give an outline for how our main proof will proceed, we first present a proof 
that the competitive ratio for a two bidder auction is at least 2. Of course, the 
fact that the 1-item Vickrey auction achieves this competitive ratio means that 
this result is tight. The proof we give below simplifies the proof of the same 
result given in [3]. 



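Lemma 3. $E[\mathcal{F}^{(2)}(\mathbf{B}^{(2)})] = 4$.

Proof. From the definition of $\mathcal{F}^{(2)}$, $\mathcal{F}^{(2)}(\mathbf{B}^{(2)}) = 2 \min \mathbf{B}^{(2)}$. Therefore, for
$z \ge 2$, $\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(2)}) > z] = \Pr[B_1 > z/2 \wedge B_2 > z/2] = 4/z^2$. Using the
definition of expectation for non-negative continuous random variables,
$E[X] = \int_0^\infty \Pr[X > x]\,dx$, we have

$$E[\mathcal{F}^{(2)}(\mathbf{B}^{(2)})] = 2 + \int_2^\infty \frac{4}{z^2}\,dz = 4.$$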



Lemma 4. The optimal competitive ratio for a two bidder auction is 2. 



The proof of this lemma follows directly from Lemmas 2 and 3, and Fact 1. 
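
For readers who want the omitted step spelled out, the following is a short sketch of how the three ingredients combine, using only the notation introduced above.

```latex
\begin{align*}
E_{\mathbf{B}}\bigl[E_A[A(\mathbf{B}^{(2)})]\bigr]
  &\le 2
  && \text{(Lemma 2 with } n = 2\text{)} \\
  &= \tfrac{1}{2}\, E_{\mathbf{B}}\bigl[\mathcal{F}^{(2)}(\mathbf{B}^{(2)})\bigr]
  && \text{(Lemma 3)}.
\end{align*}
% Apply Fact 1 with f(b) = E_A[A(b)] and g(b) = F^{(2)}(b) / 2:
% there is a fixed bid vector b with E[A(b)] <= F^{(2)}(b) / 2.
```

Hence no truthful two bidder auction has a competitive ratio below 2, and Observation 1 shows that the Vickrey auction attains this value.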

3.2 The General Case

For the general case, as in the two bidder case, we must compute the expectation
of $\mathcal{F}^{(2)}(\mathbf{B}^{(n)})$.

Lemma 5. For $n$ bids from the above distribution, the expected value of $\mathcal{F}^{(2)}$ is

$$E[\mathcal{F}^{(2)}(\mathbf{B}^{(n)})] = n - n \sum_{i=2}^{n} \left(\frac{-1}{n}\right)^{i-1} \frac{i}{i-1} \binom{n-1}{i-1}.$$



Proof. In this proof we will get a closed form expression for $\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z]$
and then integrate to obtain the expected value. Note that all bids are at
least one and therefore, we will assume that $z \ge n$. Clearly for $z < n$,
$\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z] = 1$. Let $V_i$ be a random variable for the value of the $i$th
largest bid, e.g., $V_1 = \max_i B_i$. To get a formula for $\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z]$, we define
a recurrence based on the random variable $F_{n,k}$ defined as

$$F_{n,k} = \max_{1 \le i \le n} (k + i) V_i.$$

Intuitively, $F_{n,k}$ represents the optimal single price revenue from $\mathbf{B}^{(n)}$ and an
additional $k$ consumers each of which has a value equal to the highest bid, $V_1$.
To define the recurrence, fix $n$, $k$, and $z$ and define the events $H_i$ for $1 \le i \le n$.
Intuitively, the event $H_i$ represents the fact that $i$ bidders in $\mathbf{B}^{(n)}$ and the $k$
additional consumers have bid high enough to equally share $z$, while no larger
set of $j > i$ bidders of $\mathbf{B}^{(n)}$ can do the same.



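$$H_i = \left\{ V_i \ge \frac{z}{k+i} \right\} \wedge \bigwedge_{j=i+1}^{n} \left\{ V_j < \frac{z}{k+j} \right\},$$

$$\Pr[H_i] = \binom{n}{i} \left(\frac{k+i}{z}\right)^{i} \Pr[F_{n-i,k+i} < z].$$

Note that events $H_i$ are disjoint and that $F_{n,k}$ is at least $z$ if and only if one
of the $H_i$ occurs. Thus,

$$\Pr[F_{n,k} \ge z] = \Pr\left[\bigcup_{i=1}^{n} H_i\right] = \sum_{i=1}^{n} \Pr[H_i] = \sum_{i=1}^{n} \binom{n}{i} \left(\frac{k+i}{z}\right)^{i} \Pr[F_{n-i,k+i} < z]. \tag{1}$$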



Equation (1) defines a two dimensional recurrence. The base case of this
recurrence is given by $F_{0,k} = 0$. We are interested in $\mathcal{F}^{(2)}(\mathbf{B}^{(n)})$, which is the same as
$F_{n,0}$ except that we ignore the $H_1$ case. This gives

$$\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z] = \Pr[F_{n,0} \ge z] - \Pr[H_1] = \Pr[F_{n,0} \ge z] - \frac{n}{z} \Pr[F_{n-1,1} < z]. \tag{2}$$

To obtain $\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z]$ we can solve the recurrence for $F_{n,k}$ given by
Equation (1). We will show that the solution is:

$$\Pr[F_{n,k} \ge z] = 1 - \left(\frac{z-k}{z}\right)^{n} \frac{z-k-n}{z-k}. \tag{3}$$



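Note that (3) is correct for $n = 0$. We show that it is true in general
inductively. Substituting our proposed solution (3) into (1) we obtain:

$$\Pr[F_{n,k} \ge z] = \sum_{i=1}^{n} \binom{n}{i} \left(\frac{k+i}{z}\right)^{i} \left(\frac{z-k-i}{z}\right)^{n-i} \frac{z-k-n}{z-k-i} = \frac{z-k-n}{z^{n}} \sum_{i=1}^{n} \binom{n}{i} (k+i)^{i} (z-k-i)^{n-i-1}. \tag{4}$$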



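We now apply the following version of Abel's Identity [1]:

$$(x+y)^{n} = \sum_{j=0}^{n} \binom{n}{j} x (x+j)^{j-1} (y-j)^{n-j}.$$

Making the change of variables, $j = n - i$, $x = z - k - n$, and $y = k + n$ we get:

$$z^{n} = \sum_{i=0}^{n} \binom{n}{i} (z-k-n) (z-k-i)^{n-i-1} (k+i)^{i}.$$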

We subtract out the $i = 0$ term and plug this identity into (4) to get

$$\Pr[F_{n,k} \ge z] = \frac{z-k-n}{z^{n}} \left( \frac{z^{n}}{z-k-n} - (z-k)^{n-1} \right) = 1 - \left(\frac{z-k}{z}\right)^{n} \frac{z-k-n}{z-k}.$$

Thus, our closed form expression for the recurrence is correct.
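
The induction can also be sanity-checked mechanically with exact rational arithmetic. The sketch below assumes the recurrence has the form $\Pr[F_{n,k} \ge z] = \sum_{i=1}^{n} \binom{n}{i} ((k+i)/z)^{i} \Pr[F_{n-i,k+i} < z]$ with base case $F_{0,k} = 0$, and compares it with the closed form $1 - ((z-k)/z)^{n} (z-k-n)/(z-k)$; both function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def prob_at_least(n: int, k: int, z: Fraction) -> Fraction:
    """Pr[F_{n,k} >= z] evaluated directly from the recurrence."""
    if n == 0:
        return Fraction(0)           # base case: F_{0,k} = 0 never reaches z > 0
    total = Fraction(0)
    for i in range(1, n + 1):
        below = 1 - prob_at_least(n - i, k + i, z)    # Pr[F_{n-i,k+i} < z]
        total += comb(n, i) * (Fraction(k + i) / z) ** i * below
    return total


def closed_form(n: int, k: int, z: Fraction) -> Fraction:
    """The proposed solution of the recurrence (requires z > k)."""
    return 1 - ((z - k) / z) ** n * (z - k - n) / (z - k)
```

For example, with `n = 2`, `k = 1`, `z = Fraction(5)` both functions return `Fraction(17, 25)`, and with `k = 0` the closed form collapses to `n / z`.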

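Recall our goal is to compute $\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z]$. Equation (3) shows that
$\Pr[F_{n,0} \ge z] = n/z$. This combined with Equation (2) and Equation (3) gives
the following for $z \ge n$:

$$\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z] = \frac{n}{z} - \frac{n}{z} \Pr[F_{n-1,1} < z] = \frac{n}{z} \Pr[F_{n-1,1} \ge z] = \frac{n}{z} \left( 1 - \left(\frac{z-1}{z}\right)^{n-1} \frac{z-n}{z-1} \right).$$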


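Recall that for $z < n$, $\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z] = 1$. To complete this proof, we use
the formula $E[\mathcal{F}^{(2)}(\mathbf{B}^{(n)})] = \int_0^\infty \Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z]\,dz = n +
\int_n^\infty \Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z]\,dz$. In the form above, this is not easily integrable;
however, we can transform it back into a binomial sum which we can integrate:

$$\Pr[\mathcal{F}^{(2)}(\mathbf{B}^{(n)}) \ge z] = n \sum_{i=2}^{n} (-1)^{i}\, i \binom{n-1}{i-1} z^{-i},$$

$$E[\mathcal{F}^{(2)}(\mathbf{B}^{(n)})] = n + n \sum_{i=2}^{n} (-1)^{i} \frac{i}{i-1} \binom{n-1}{i-1} n^{-(i-1)} = n - n \sum_{i=2}^{n} \left(\frac{-1}{n}\right)^{i-1} \frac{i}{i-1} \binom{n-1}{i-1}. \; ■$$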




Theorem 2. The competitive ratio of any auction on $n$ bidders is at least

$$1 - \sum_{i=2}^{n} \left(\frac{-1}{n}\right)^{i-1} \frac{i}{i-1} \binom{n-1}{i-1}.$$


This theorem comes from combining Lemma 2, Lemma 5, and Fact 1. Of 
course, for the special case of n = 2 this gives the lower bound of 2 that we 
already gave. For n = 3 this gives a lower bound of 13/6. A lower bound for 
the competitive ratio of the best auction for general n is obtained by taking the 
limit. In the proof of the main theorem to follow, we use the following fact. 
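
The bound of Theorem 2 is straightforward to evaluate exactly. The sketch below assumes the bound has the form $1 - \sum_{i=2}^{n} (-1/n)^{i-1} \frac{i}{i-1} \binom{n-1}{i-1}$, which reproduces the values 2 and 13/6 quoted for $n = 2$ and $n = 3$; the function name `lower_bound` is hypothetical.

```python
from fractions import Fraction
from math import comb


def lower_bound(n: int) -> Fraction:
    """Exact value of the n-bidder lower bound on the competitive ratio."""
    if n < 2:
        raise ValueError("the benchmark requires at least two bidders")
    total = Fraction(0)
    for i in range(2, n + 1):
        total += Fraction(-1, n) ** (i - 1) * Fraction(i, i - 1) * comb(n - 1, i - 1)
    return 1 - total


if __name__ == "__main__":
    for n in (2, 3, 4, 10, 100):
        print(n, float(lower_bound(n)))
```

Using `Fraction` avoids the cancellation errors that an alternating sum of binomial terms would otherwise suffer in floating point for larger `n`.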



Fact 2. For $1 \le k \le K$ and $0 < a_k < 1$, $\prod_{k=1}^{K} (1 - a_k) \ge 1 - \sum_{k=1}^{K} a_k$.



Theorem 3. The competitive ratio of any auction is at least 2.42. 



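Proof. We prove this theorem by showing that

$$\lim_{n \to \infty} \left( 1 - \sum_{i=2}^{n} \left(\frac{-1}{n}\right)^{i-1} \frac{i}{i-1} \binom{n-1}{i-1} \right) = 1 + \sum_{i=2}^{\infty} (-1)^{i} \frac{i}{(i-1)(i-1)!}. \tag{5}$$

After which, routine calculation shows that the right hand side of the above
equation is at least 2.42, which gives the theorem. To prove that (5) holds, it is
sufficient to show that

$$\left| \left( 1 + \sum_{i=2}^{n} (-1)^{i} \frac{i}{(i-1)(i-1)!} \right) - \left( 1 - \sum_{i=2}^{n} \left(\frac{-1}{n}\right)^{i-1} \frac{i}{i-1} \binom{n-1}{i-1} \right) \right| = O\!\left(\frac{1}{n}\right).$$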
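
We proceed as follows:

$$\begin{aligned}
&\left| \left( 1 + \sum_{i=2}^{n} (-1)^{i} \frac{i}{(i-1)(i-1)!} \right) - \left( 1 - \sum_{i=2}^{n} \left(\frac{-1}{n}\right)^{i-1} \frac{i}{i-1} \binom{n-1}{i-1} \right) \right| \\
&\quad = \left| \sum_{i=2}^{n} (-1)^{i} \frac{i}{i-1} \left( \frac{1}{(i-1)!} - \frac{1}{n^{i-1}} \binom{n-1}{i-1} \right) \right| \\
&\quad \le \sum_{i=2}^{n} \frac{i}{(i-1)(i-1)!} \left( 1 - \frac{(n-1)(n-2) \cdots (n-i+1)}{n^{i-1}} \right) \\
&\quad = \sum_{i=2}^{n} \frac{i}{(i-1)(i-1)!} \left( 1 - \prod_{k=1}^{i-1} \left( 1 - \frac{k}{n} \right) \right) \\
&\quad \le \sum_{i=2}^{n} \frac{i}{(i-1)(i-1)!} \cdot \frac{1}{n} \sum_{k=1}^{i-1} k \\
&\quad = \frac{1}{2n} \sum_{i=2}^{n} \frac{i^{2}}{(i-1)!}.
\end{aligned}$$

The last inequality uses Fact 2. Since $(i-1)!$ grows at least exponentially,
$\sum_{i=2}^{n} i^{2}/(i-1)!$ is bounded by a constant and
we have the desired result. ■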



4 Lower Bounds versus Upper Bounds 

As mentioned earlier, the lower bound of 2.42 for large n does not match the 
competitive ratio of the best known auction (currently 3.39 [4]). In this section, 
we briefly consider the issue of matching upper bounds for small values of n. 
For n = 2 the 1-item Vickrey auction obtains the optimal competitive ratio of 
2 (see Section 2.3). It is interesting to note that for the n = 2 case the optimal 
auction always uses sale prices chosen from the set of input bids (in particular, 
the second highest bid). This motivates the following definition. 

Definition 6. We say an auction, A, is restricted if on any input the sale prices 
are drawn from the set of input bid values, unrestricted otherwise. 

While the Vickrey auction is a restricted auction, the Vickrey auction with
reserve price $r$, which offers the highest bidder the greater of $r$ and the second
highest bid value, is not restricted as $r$ may not necessarily be a bid value.

Designing restricted bid-independent auctions is easier than designing general
ones as the set of sale prices is determined by the input bids. However, as we
show next, even for the $n = 3$ case the optimal restricted auction's competitive
ratio is worse than that of the optimal unrestricted auction.

Lemma 6. For $n = 3$, no restricted truthful auction, $\mathrm{BI}_f$, can achieve a
competitive ratio better than 5/2.

Proof. Because $\mathrm{BI}_f$ is restricted, $f(a, b) \in \{a, b\}$. For $h > 1$ and $a \ge hb$, let

$$p = \sup_{a,b} \Pr[f(a, b) = b].$$

For $\epsilon$ close to zero, let $a$ and $b$ be such that $a \ge hb$ and $\Pr[f(a, b) = b] \ge p - \epsilon$.

The expected revenue for the auction on $\{a, b + \epsilon', b\}$ is at most $b + \epsilon' + pb$.
Here, the $b + \epsilon'$ is an upper bound on the payment from the $a$ bid and the $pb$ is
an upper bound on the expected payment from the $b + \epsilon'$ bid (as $p$ is an upper bound
on the probability that this bid is offered price $b$). Note that $\mathcal{F}^{(2)} = 3b$ so the
competitive ratio obtained by taking the limit as $\epsilon' \to 0$ is at least $3/(1 + p)$.

An upper bound for the expected revenue for the auction on $\{a + \epsilon', a, b\}$ is
$2pb + (1 - p + \epsilon)a$. The $pb + (1 - p + \epsilon)a$ is from the $a + \epsilon'$ bid and the $pb$ is from the $a$ bid.
For large $h$, $\mathcal{F}^{(2)} = 2a$ so, taking $a = hb$, the competitive ratio is at least
$2h/(2p + h(1 - p + \epsilon))$. The limit as $\epsilon \to 0$ and $h \to \infty$ gives a bound on the
competitive ratio of $2/(1 - p)$.

Setting these two ratios equal we obtain an optimal value of $p = 1/5$ which
obtains a competitive ratio of 5/2. ■



This lower bound is tight as the following lemma shows. 
Lemma 7. For $a \ge b$, the bid-independent auction, $\mathrm{BI}_f$, with

$$f(a, b) = \begin{cases} b & \text{with probability } 1/5 \\ a & \text{otherwise} \end{cases}$$

achieves a competitive ratio of 5/2 for three bidders.

We omit the proof as it follows via an elementary case analysis. It is interesting
to note that the above auction is essentially performing a 1-item Vickrey auction
with probability 4/5 and a 2-item Vickrey auction with probability 1/5.
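
Since the case analysis is omitted, a brute-force check on small integer bids may be a useful complement. The sketch below computes the exact expected revenue of the auction of Lemma 7 on a three-bid vector and compares it with the benchmark. It assumes that a bid equal to the offered price is accepted (Definition 2 leaves this choice open); the names `lemma7_rule`, `expected_revenue` and `benchmark` are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple

# A randomized pricing rule returns (probability, price) pairs.
PriceDist = List[Tuple[Fraction, Fraction]]


def lemma7_rule(others: Sequence[int]) -> PriceDist:
    """Offer the lower of the two other bids w.p. 1/5, the higher otherwise."""
    a, b = max(others), min(others)
    return [(Fraction(1, 5), Fraction(b)), (Fraction(4, 5), Fraction(a))]


def expected_revenue(bids: Sequence[int]) -> Fraction:
    """Exact expected revenue, by linearity over bidders and offered prices."""
    total = Fraction(0)
    for i, b_i in enumerate(bids):
        others = [x for j, x in enumerate(bids) if j != i]
        for prob, price in lemma7_rule(others):
            if price <= b_i:         # assumption: ties are accepted
                total += prob * price
    return total


def benchmark(bids: Sequence[int]) -> int:
    """Optimal single price profit selling at least two units."""
    v = sorted(bids, reverse=True)
    return max(k * v[k - 1] for k in range(2, len(v) + 1))
```

On the bids `[10, 4, 3]` the expected revenue is $2 \cdot 3/5 + 4 \cdot 4/5 = 22/5$ against a benchmark of 9, a ratio of about 2.05.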

Lemma 8. An unrestricted three bidder auction can achieve a better competitive 
ratio than 5/2. 



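Proof. For $a \ge b$, the bid independent auction $\mathrm{BI}_f$ with

$$f(a, b) = \begin{cases}
b & \text{with probability } 15/23 \\
3b/2 & \text{with probability } 8/23
\end{cases} \quad \text{if } b \le a \le 3b/2,$$

$$f(a, b) = \begin{cases}
b & \text{with probability } 3/23 \\
a & \text{with probability } 20/23
\end{cases} \quad \text{if } a > 3b/2,$$

has competitive ratio 2.3. We omit the elementary case analysis. ■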
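
The omitted case analysis can be probed numerically in the same spirit. The offered prices below are an assumption of this sketch: it takes $f(a,b)$ to offer $b$ or $3b/2$ (with probabilities 15/23 and 8/23) when $b \le a \le 3b/2$, and $b$ or $a$ (with probabilities 3/23 and 20/23) when $a > 3b/2$, and it again assumes that ties are accepted. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def lemma8_rule(a, b):
    """Assumed price distribution f(a, b) for a >= b, as (probability, price) pairs."""
    if a <= Fraction(3, 2) * b:
        return [(Fraction(15, 23), Fraction(b)), (Fraction(8, 23), Fraction(3, 2) * b)]
    return [(Fraction(3, 23), Fraction(b)), (Fraction(20, 23), Fraction(a))]


def expected_revenue(bids):
    """Exact expected revenue of the three bidder auction on the given bids."""
    total = Fraction(0)
    for i, b_i in enumerate(bids):
        others = [x for j, x in enumerate(bids) if j != i]
        for prob, price in lemma8_rule(max(others), min(others)):
            if price <= b_i:         # assumption: ties are accepted
                total += prob * price
    return total


def worst_ratio(max_bid: int) -> Fraction:
    """Largest benchmark-to-revenue ratio over integer bids in 1..max_bid."""
    worst = Fraction(0)
    for bids in combinations_with_replacement(range(1, max_bid + 1), 3):
        v = sorted(bids, reverse=True)
        target = max(2 * v[1], 3 * v[2])     # optimal sale of two or three units
        worst = max(worst, target / expected_revenue(bids))
    return worst
```

Under these assumptions the grid search never exceeds 23/10, and the bid vector `(4, 4, 3)` attains it, which is consistent with the stated competitive ratio of 2.3.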



Recall that the lower bound on the competitive ratio for three bidders is
$13/6 \approx 2.17$. Obtaining the optimal auction for three bidders remains an
interesting open problem.

5 Conclusions

We have proven a lower bound of 2.42 on the competitive ratio of any truthful
auction. The algorithmic technique used, that of looking at distributions of
bidders on which all auctions perform the same and bounding the expected value of
the metric (e.g., $\mathcal{F}^{(2)}$), is natural and useful for other auction related problems.

There is a strange artifact of the competitive framework that we employ here
(and that is used in prior work [3,4]). As we showed, the optimal worst
case auction for selling two items is the 1-item Vickrey auction. This auction only
sells one item, yet we had two items. Our optimal restricted auction for three
items never sells more than two items. Yet, under our competitive framework it
is not optimal to run this optimal restricted auction for three items when there
are only two items for sale. As it turns out, this is not a problem when using
a different but related metric, $V_{\mathrm{opt}}$, defined as the $k$-item Vickrey auction that
obtains the highest profit, i.e., $V_{\mathrm{opt}}(\mathbf{b}) = \max_i (i - 1) b_i$ (for $b_i \ge b_{i+1}$).

Acknowledgements. We would like to thank Amos Fiat for many helpful 
discussions. 

In the paper [1] we presented a tight analysis of the so-called Harmonic
algorithm for three servers, the claim being that the algorithm is 6-competitive.

Unfortunately, this analysis contains an error which we are not able to correct,
and we therefore have to withdraw our claim.

The error is in the proof of Theorem 2.1, in the analysis of the adversary move.
We claim that the potential changes by the change of $H_{11'}$ when the adversary
moves the server $1'$, and subsequently bound this change. However, the other
components of the potential, namely $H_{22'}$ and $H_{33'}$ for three servers, may also change,
as the metric spaces underlying these random walks change due to the move
of the server $1'$. Thus, overall, the potential can change more than the paper
claims and we cannot prove any improved bound on the competitive ratio using
this potential.

The framework of random walks (in particular the proofs of the inequalities
(A) to (C) for three servers and (D) for any number of servers) remains untouched
by this error. We still believe that this type of analysis may be useful for the
analysis of Harmonic. However, the potential function has to be significantly
different.